Universal Memory Architectures for Autonomous Machines

Dan P. Guralnik, Daniel E. Koditschek


Abstract: We propose a self-organizing memory architecture for perceptual experience capable of supporting autonomous learning and goal-directed problem solving in the absence of any prior information about the agent’s environment. The architecture is simple enough to ensure (1) a quadratic bound (in the number of available sensors) on space requirements, and (2) a quadratic bound on the time-complexity of the update-execute cycle. At the same time, it is sufficiently complex to provide the agent with an internal representation which is (3) minimal among all representations of its class which account for every sensory equivalence class subject to the agent’s belief state; (4) capable, in principle, of recovering the homotopy type of the system’s state space; (5) learnable with arbitrary precision through a random application of the available actions.

The provable properties of an effectively trained memory structure exploit a duality between weak poc sets (a symbolic, discrete representation of subset nesting relations) and non-positively curved cubical complexes, whose rich convexity theory underlies the planning cycle of the proposed architecture.


I. Introduction 

A. Motivation 

A major obstacle to autonomous systems synthesis is the absence of a capacious but efficient memory architecture. In humans, memory influences behaviour over a wide range of time scales, leading to the emergence of what seems to be a functional hierarchy of sub-systems [1]: from non-declarative vs. declarative through the split of declarative memory into semantic and episodic [2]; and on to theories of attention and recall [3]. This variety of scales is mirrored in the collection of problems addressed by the synthetic sciences: from learning dependable actions/motion primitives [?], [?]; through learning objects and their affordances [?], [?] to demonstration-driven task execution [?], [?]; through exploring and mapping an unknown environment [10], [11], [?], [13] and motion planning [?], [?], [?]; and on to general problem solving [?] using artificial general intelligence (AGI) architectures [?], [?], [?].

One idea stands out as common to all these approaches, beginning with the formal notion of a problem space introduced by Newell and Simon [?], [?]: the purpose of a memory architecture is to learn the transition structure of the state space X of the system comprised of the agent and its environment E while processing the history of observations into a format facilitating improved future control.

It is often argued (e.g. [22], [?], [?]) that memory architectures for general agents should enjoy a high degree of domain- and task-independence. In general, however, clear definitions of notions such as ‘domain’ and ‘task’ are not readily forthcoming across the vast breadth of literatures discussing memory, agents and autonomy. Notions of ‘universal learners’ have been proposed [?] based on optimizing gain in estimators of predictive entropy; however, there is evidence to suggest that the resulting level of generality may be insufficient for some tasks [?].

Absent broadly recognized formal foundations, we advance an architecture provably satisfying intuitive universality properties, including, most centrally: (1) interactions with the environment are encoded in the most generic, yet minimal, manner possible, while requiring no prior semantic information; and (2) learning obtains from direct binary sensory input, automatically developing appropriate contextual links between sensations of arbitrary modality. A key outcome is that the architecture encodes its observation history in a model space that supports the agent’s problem solving as a form of reactive motion planning whereby atomic computations provably correspond to nearest point projection in the reachable set.

B. Contribution 

We consider a generic discrete binary agent (DBA): a machine sensing and interacting with its environment in discrete time, equipped with a finite collection Σ of Boolean-valued sensors, some of which serve as triggers for actions/behaviors (switched on and off at will).

Given an instance of a DBA interacting with an environment E, it is natural to view the set S of sensory equivalence classes of the associated transition system X as a subset of the power set {0,1}^Σ. It is generally accepted [27], [?] that a memory architecture must be capable of supporting internal representations rich enough to account for the diversity [29] of the transition system X: exact problem solving, when construed as abstract motion planning, requires an internal representation capable, eventually, of accounting for all the classes in S and the transitions between them. Unfortunately, as expressed forcefully in [29] and as we review at length below, the task of obtaining an exact description of S becomes intractable as the number of sensors grows, unless strong simplifying assumptions are made about X.

To circumvent this obstacle, rather than imposing any specific structure on X, we propose to relax the requirement for precise reconstruction by introducing an approximation whose discrepancy from S we characterize exactly and show to be the smallest possible in its (computationally effective) class of objects.

The new memory and control architecture we propose here consists of two layers:

• A data structure S, called a snapshot, keeping track of the current state and summarizing the history of observations made by the agent in terms of a collection of real-valued registers, of size quadratic in the number of sensors.

• A reactive planner, built on a weak poc set structure P ([?], [?] and Definition A.1) constituting a record of pairwise implications among the atomic sensations as observed by the agent; P is computed from S in each control cycle.


A crucial property of our architecture is that P and M are formally reconstructible from each other. The model space M takes the form of a CAT(0) cubical complex, or cubing,^1 whose 0-skeleton is contained in {0,1}^Σ. As the snapshot S is updated by incoming observations, the space M, as encoded by P, is transformed along with it. We can state our main contributions (albeit, necessarily, informally at this point) in terms of provable properties of the architecture and its model spaces:


(i) Universality of Representation. M is the minimal model guaranteed to represent all the sensory equivalence classes of any sensorium Σ satisfying the record P (see appendix A-E3). In particular, in the absence of information not already encoded in P, it is impossible to distinguish the 0-skeleton of M from the set of sensory equivalence classes, S.

(ii) Topological Approximation. As a topological space, M is always contractible.^2 Provided a sufficiently rich sensorium, the sub-complex of M consisting of the faces all of whose vertices lie in S inherits from M the topology^3 of the observed space X (see appendix A-E3 and [?]).

(iii) Low-complexity, effective learning. The proposed architecture requires quadratic space (in the number of sensors) for storage, and no more than quadratic time for updating. Furthermore, an agent picking actions at random learns an approximation of the resulting walk’s limiting distribution on X (see II-E2).

(iv) Efficiency of Planning. Planning the next action given a target sensation takes quadratic time in the number of sensors, while eliminating the need for searching in the model space. With sufficient parallel processing power, this bound may be reduced to a constant multiple of the height of P, that is, the maximum length of a chain of implications in P (see III-B).


To the best of our knowledge, this combination of provable 
properties has not previously appeared in the literature. 

^1 For a good introduction to CAT(0) cubical complexes, see [?]. For a tutorial on cell complexes see [?], chapter 0 and appendix.

^2 The formal notion of being ‘hole-free’; see [?], chapter 0.

^3 Up to homotopy equivalence; see the definition in [?], chapter 0.


C. Overview and Related Literature 

To establish the novelty of our contribution we now briefly review the copious literature bearing on these topics as arising from three distinct traditions: robotics, connectionist computation, and artificial general intelligence. After presenting our technical ideas we will explore, at the end of the paper and in a more discursive form, their relation to and implications for the broader field.

1) Relation to Mapping and Navigation: Formulating navigation and mapping problems in terms of a point agent moving through a homotopically trivial ambient space while avoiding a collection of geometrically defined obstacle regions representing forbidden states is fundamental to motion planning [?], [?] and mapping [?], [34]. The ubiquity of obstacles in these settings introduces topological considerations whose primacy is well established in the algorithmic literature [?], [?], [?], [?], [?], [?], [?], governing the complexity of not only motion planning [?] but even set membership [?].

Our strategy is to reduce the general problem of memory 
storage and its use for motion planning in the underlying 
transition structure of a problem space (as sensed by a DBA) 
to the geometric problem of motion planning in the agent’s 
model space M (playing the role traditionally assigned to 
Euclidean space). Generalizing the Euclidean setting, M has
a very strong convexity theory [?], [?] enabling low-cost
greedy navigation.

The topological point of view has been shown to be well warranted in the discrete setting as well. As was demonstrated by Pratt [41], oriented topological structures (cubical complexes, in fact) may be used to encode the causal relations among actions and states in symbolic transition systems. Approaches generalizing Pratt’s have since been used to formulate very general models of reconfigurability and self-assembly [?], [?].

2) Mechanisms for Learning and Planning: Snapshots use an evolving estimate of pairwise intersections of sensor footprints to form a record of implications among the atomic sensations of the DBA. The necessity of such a record for planning goes back (at least) to [?], yet ideas about applying it as a way to encode context are fairly recent and specialized [?], [?], [?]. Our internal representation takes the additional step of applying this principle to all the sensations available to a DBA, including the control signals it uses to interact with its environment.

The resulting learning and control mechanisms may be realized in a highly simplified and idealized, yet highly plastic, network of neuron-like cells simulating the structure of P [?]. This analogy with neural networks is not a coincidence: estimating arbitrary intersections from near-synchronous activation of sensors in a planar sensor field has been explored as a means for topological [?] as well as metric mapping by competitive attractor networks (RatSLAM [?], [?], [?]), while the study of the structure of stability properties vis-a-vis topology and plasticity in more general networks is just taking off [?], [?].

3) Model Spaces: The necessity to maintain high-dimensional representations of the state space X poses a major challenge for current approaches to learning [?], [?], [?] and general problem solving [53], [54]. The method closest to ours in its formalism seems to be that of [?], which even lends itself to learning by a connectionist network [55], but still requires an exponentially large representation for planning purposes. By contrast, in our case, the ability to translate action planning in M into what is essentially a flow problem in a network constructed from the underlying sensorium obviates the need for maintaining M in memory, allowing us to evade the curse of dimensionality. Nevertheless, we are still guaranteed a model space that is sufficiently rich to account for all sensory equivalence classes perceivable by the DBA [27].

The computational advantages of our approach come at a cost that is largely driven by topology, as expressed in (ii): M necessarily has trivial topology^4 [?], [31], [57], and our own result [?] establishes formal conditions on Σ under which the complex reproduces the “topological shape”^5 of X (as discussed above), which may not be topologically trivial. The basic algorithm driving planning in our agents, however, achieves its efficiency by disregarding this mismatch. The introduction of auxiliary intrinsic motivation mechanisms [58], [59] as a means of steering the agent away from obstacle states in M and towards desirable behaviours (not necessarily states!) seems to be a possible way out of this predicament, as well as towards a solution of the problem of closing the control loop. At this early stage, as a feasibility study for the overall approach, we only consider very simple excitation mechanisms causing the agent to choose actions with the desire to maximize immediate excitation gain, to the extent that may be sensed by Σ (and otherwise to choose random actions).


D. Organization of the Paper 

Having already given proofs of the formal results underlying (i) and (ii) in our previous paper [?], we defer the technical discussion of poc sets to appendix A. This appendix is intended as an introductory overview of the theory of weak poc sets and the geometry of their dual spaces (our agents’ model spaces) as well as a repository of proofs of technical results we require but could not find elsewhere in the literature.

Section II discusses (iii). We formally state the observation model for DBAs, describe snapshots and their learning mechanisms, and present our early numerical work illustrating the practical implications of the claims regarding learning.

Section III is dedicated to item (iv) in the list of contributions. Actions are introduced to the observation model, and control algorithms are defined and validated.

Finally, following an extended discussion of our results in 
relation to existing literature in section IV and the aforemen¬ 
tioned appendix dealing with poc sets, a second appendix 
presents the proofs of technical results about snapshots. 


II. Snapshots: From Observation Sequences to a 
Memory Structure 

We begin with a formal statement of what we mean by a DBA and its observation model. We then proceed to construct snapshots, their updating mechanisms and the derived weak poc set structures, and conclude the section with results on the learning capabilities of snapshots. Table I previews notation that will persist throughout the paper.

TABLE I
Table of Mathematical Symbols

Notation      Topic                                                          Ref.
------------  -------------------------------------------------------------  --------------
              DBA model (general)
E             Environment (with points p, q, ...)                            Sec. II-A1
X             State space of the experiment (with points x, y, ...)          Sec. II-A1
pos           The position map X → E                                         Sec. II-A1
T             Time, the set of integers                                      Sec. II-A2
·|_t          Reads as: "at time t"                                          Eqn. (1)

              DBA model (sensing)
Σ             Sensorium (elements are a, b, c, ..., with involution a ↦ a*)  Eqn. (6)
ρ             Realization map of the sensorium Σ                             Eqn. [?]
(a : x)       Evaluation, e.g. of a ∈ Σ on x ∈ X                             Eqn. [?]

              DBA computational model (at time t)
S|_t          Agent’s snapshot                                               Sec. III-B1
Γ|_t          The derived poc graph, Dir(S|_t)                               Sec. III-B1
P|_t          Derived (weak) poc set structure on Σ, Poc(S|_t)               Sec. II-B1
M|_t          The model space Cube(P|_t)                                     Sec. II-B3
M_×|_t        The punctured model space Cube_×(P|_t)                         Def. III.8
O|_t          Raw observation                                                Sec. II-B2
S|_t          Recorded observation                                           Sec. II-B4
[?]           Decision (action) following the observation

              Contents/parameters of a snapshot S
K_Σ           The complete graph on Σ with all aa* edges removed             Def. II.2
#S            State of the snapshot                                          Def. II.3(a)
w_ab          Weight on the edge ab                                          Def. II.3(b)
τ_ab          Learning threshold for the pair a, b ∈ Σ                       Def. II.3(c)
ω             Orientation cocycle of S                                       Prop. II.7
μ(ab)         Dissimilarity measure of S                                     App. B-C

              Objects derived from a snapshot S
Dir(S)        Derived poc graph                                              Prop. II.8
Poc(S)        Derived weak poc set structure                                 Def. II.10

              Weak poc sets and their duals
P, Q          Poc sets (with and without indices)                            Def. A.1
P°            The set dual of P, the 0-skeleton of Cube(P)                   Def. A.8(b)
Dual(P)       Dual graph of P, the 1-skeleton of Cube(P)                     Def. A.8(c)
Cube(P)       Dual cubing of the poc set P                                   Def. A.8(a)
Cube_×(P)     The punctured dual Cube(P, ρ) with respect to a realization ρ  Def. A.28
f°            The dual map f° : Q° → P° of a poc morphism f : P → Q          Defs. [?], A.24

^4 Again, in the sense of M being contractible.

^5 In the sense of homotopy type; see [?], chapter 0.

A. Observation Model for DBAs 

1) Environment and State: We place an agent in an environment $E$. The state space of the system will be denoted by $X$, where we assume there is a map $\mathrm{pos} : X \to E$ producing the location $\mathrm{pos}(x)$ of the agent in $E$, given the state $x \in X$ of the system as a whole. As it turns out, no further mathematical structure on $X$ is required for the results that follow; hence, with a mind toward inviting the broadest range of applications, we impose none, much in the spirit of McCarthy and Hayes’ discussion of situation calculus [?].

2) Time and Transitions: We model time $T$ as the set of integers (the subjective time of the agent), with $t = 0$ corresponding to the initial time. The basic objects of study are then trajectories, or maps of the form

$$\varphi : T \to X, \qquad t \mapsto \varphi|_t. \tag{1}$$

We define abstract transitions in $X$ as follows:

Definition II.1 (n-transitions). An element of the $(n+1)$-fold Cartesian power $X^{n+1}$ will be referred to as an n-transition. For any trajectory $\varphi : T \to X$ and $n \ge 0$ we define the map

$$T \to X^{n+1}, \qquad t \mapsto \left(\varphi|_{t-n}, \ldots, \varphi|_{t}\right). \tag{2}$$

We refer to 0-transitions as states, and to 1-transitions simply as transitions. □
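The windowing in Definition II.1 can be illustrated on a finite run. The following Python sketch is illustrative only and not part of the formal development: it assumes a trajectory stored as a finite list (whereas the text takes $T$ to be all integers), and the function name `n_transitions` is hypothetical.

```python
from typing import Hashable, List, Sequence, Tuple


def n_transitions(trajectory: Sequence[Hashable], n: int) -> List[Tuple[Hashable, ...]]:
    """Return the n-transitions (x_{t-n}, ..., x_t) for every t that has a full history.

    Assumption: the trajectory is a finite list indexed from 0, so times t < n
    are skipped because no complete window ends there.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    return [tuple(trajectory[t - n:t + 1]) for t in range(n, len(trajectory))]


run = ["p", "q", "q", "r"]           # a toy run through a four-step state sequence
states = n_transitions(run, 0)       # 0-transitions are states
transitions = n_transitions(run, 1)  # 1-transitions are ordinary transitions
```

With this representation, a sensor of order $n$ only ever needs to inspect one element of `n_transitions(run, n)` per time step.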

Any setting where E, X and the transition structure of the system are specified (though, possibly, in an implicit fashion), implies constraints on the set of achievable trajectories. We will refer to such settings as experiments, within the framework of which each allowed trajectory will be referred to as a run of the experiment, while observations produced by sensors during a run (below) will be called experiences.

3) Discrete Binary Agents: A discrete binary agent (DBA) is endowed with a collection of binary sensors indexed by a finite set $\Sigma$. We will assume that each $a \in \Sigma$ is assigned an order $n_a \ge 0$, and a realization $\rho(a) \subseteq X^{n_a+1}$. We then say that $a$ is an $n_a$-sensor, or a sensor of degree $n_a$. For example, a 0-sensor (or state sensor) responds to the system entering a certain subset of $X$ (a “macro-state”), while a 1-sensor responds to the system experiencing a transition of a particular kind.

Evaluation of sensors is best viewed in the context of trajectories: an $n$-sensor $a \in \Sigma$ is applied to a trajectory $\varphi$ and assigned a value at time $t \in T$ according to the rule

$$(a : \varphi)|_t = 1 \iff \left(\varphi|_{t-n}, \ldots, \varphi|_{t}\right) \in \rho(a). \tag{3}$$

Here $(a : \varphi)|_t$ denotes the measurement provided by the sensor $a$ at time $t$ given the trajectory $\varphi$. To avoid a profusion of parentheses and subscripts we will generally use bracket notation to denote the evaluation of Boolean- and scalar-valued functions:

$$(a : x) := 1_{\rho(a)}(x) \quad \text{whenever } a \in \Sigma, \tag{4}$$

$$(g : s) := g(s) \quad \text{whenever } g : S \to [0,1], \tag{5}$$

and so forth. The symbol $1_A$ will always denote the indicator function of a set $A$ with respect to the appropriate super-set.

We will assume that $\Sigma$ comes endowed with a map $a \mapsto a^*$ satisfying the following for all $a \in \Sigma$:

$$a^{**} = a, \qquad a^* \neq a, \tag{6}$$

as well as

$$\rho(a^*) = \rho(a)^c := X^{n_a+1} \setminus \rho(a). \tag{7}$$

We also introduce the virtual sensors $\mathbf{0}, \mathbf{0}^* \in \Sigma$ evaluating to

$$(\mathbf{0} : \varphi)|_t = 0, \qquad (\mathbf{0}^* : \varphi)|_t = 1 \tag{8}$$

on any trajectory $\varphi$ and at any time $t \in T$. For subsets $A \subseteq \Sigma$ we will always use the notation $A^*$ to denote the set of all $a^*$, $a$ ranging over $A$.
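To make rules (3) and (6)-(7) concrete, the following Python sketch evaluates sensors on a finite run. It is illustrative only and rests on several assumptions not made in the text: sensors are named by strings, the involution is encoded by a trailing `*` (so base names must not end in `*`), realizations are given as membership predicates on windows, the run is a finite list, and the virtual sensors $\mathbf{0}, \mathbf{0}^*$ are left out. The names `star`, `close_under_star` and `evaluate` are hypothetical.

```python
from typing import Callable, Dict, Sequence, Tuple

Window = Tuple[str, ...]
Sensor = Tuple[int, Callable[[Window], bool]]  # (order n_a, membership test for rho(a))


def star(a: str) -> str:
    """Involution a -> a*: toggles a trailing '*', so star(star(a)) == a and star(a) != a."""
    return a[:-1] if a.endswith("*") else a + "*"


def close_under_star(base: Dict[str, Sensor]) -> Dict[str, Sensor]:
    """Add the complementary sensor a* for every base sensor a, mirroring equation (7)."""
    sensorium: Dict[str, Sensor] = {}
    for name, (order, test) in base.items():
        sensorium[name] = (order, test)
        # Bind `test` as a default argument so each complement keeps its own predicate.
        sensorium[star(name)] = (order, lambda w, test=test: not test(w))
    return sensorium


def evaluate(sensorium: Dict[str, Sensor], a: str, run: Sequence[str], t: int) -> int:
    """Return (a : phi)|_t on a finite run, following rule (3)."""
    order, test = sensorium[a]
    if t - order < 0 or t >= len(run):
        raise IndexError("the run has no full window of this order ending at time t")
    return int(test(tuple(run[t - order:t + 1])))


sigma = close_under_star({
    "in_pond": (0, lambda w: w[0] == "pond"),                         # a state sensor
    "enters_pond": (1, lambda w: w[0] != "pond" and w[1] == "pond"),  # a transition sensor
})
run = ["field", "pond", "pond"]
```

In this toy run, `in_pond` fires at times 1 and 2, while `enters_pond` fires only at time 1, when the window is `("field", "pond")`.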

The database structure we will be using is designed to maintain an approximate record of the relations among sensors in $\Sigma$ believed by the agent to hold true throughout time. This record at time $t \in T$ is encoded in a weak poc set structure $P|_t$ over $\Sigma$ (Definition A.1).

For two $n$-sensors $a, b \in \Sigma$ this requirement translates into $a \le b$ in $P|_t$ being treated (for planning purposes) as the inclusion $\rho(a) \subseteq \rho(b)$ in the space $X^{n+1}$. Note how the equivalent containment $\rho(b^*) \subseteq \rho(a^*)$ is encoded by the contrapositive implication $b^* \le a^*$, which, by the definition of a weak poc set, holds if and only if $a \le b$ does.

When $a, b$ have different orders we are forced to replace this requirement by a weaker one: at any time $t$, our agents will interpret $a \le b$ in $P|_t$ as

$$(a : \varphi)|_{t'} \le (b : \varphi)|_{t'} \tag{9}$$

holding for all $t' \in T$. In other words, our agents assume that relations among sensors do not change over time.^6

For example, if $a, b \in \Sigma$ where $a$ is a state sensor and $b$ is a transition sensor, consider the statements:

$$(\dagger)\;\; (a : \varphi)|_t \le (b : \varphi)|_t, \qquad (\ddagger)\;\; (b : \varphi)|_t \le (a : \varphi)|_t,$$

treated as identities over both $\varphi$ and $t$. We see that (‡) states that all transitions of type $b$ must terminate in a state of type $a$, while (†) means that only transitions of type $b$ could produce state $a$. It is clear that both kinds of statement are essential for planning.

^6 This is not to say that our agents are not allowed to change their minds regarding which relations hold true and which do not: the purpose of keeping a dynamic record of relations is to eventually uncover the ‘correct’ relations.


B. The Model Spaces


1) A Record of Implications: Informally, a “record of implications in $\Sigma$” is a partial ordering on $\Sigma$ reflecting the standard interactions between Boolean complementation and Boolean implication. Formally, our DBA will maintain, at any time $t \ge 0$, a weak poc set structure $P|_t$ on $\Sigma$ consisting of a partial order relation $\le$ satisfying, for all $a, b \in \Sigma$:

1) $\mathbf{0} \le a$;

2) $a \le b \Rightarrow b^* \le a^*$.

Note that $a^* \neq a$ by the construction of $\Sigma$, and compare with Definition A.1 in the appendix.


2) Observations as vertices of a cube: From the agent’s viewpoint, the current state of the experiment at time $t$ is completely characterized by the measurements $(a : \varphi)|_t$, where $\varphi$ is the agent’s trajectory. Equivalently, the state may be encoded in a subset $O|_t \subseteq \Sigma$ satisfying $\left|\, O|_t \cap \{a, a^*\} \right| = 1$ for all $a \in \Sigma$. Such subsets of $\Sigma$ are called complete $*$-selections. An incomplete measurement of the state would then correspond to a subset $O \subseteq \Sigma$ satisfying $\left| O \cap \{a, a^*\} \right| \le 1$ for all $a \in \Sigma$, called an (incomplete) $*$-selection; see Definition A.6 in the appendix, along with remarks on the notation to follow.

Thus, one thinks of the collection $S(\Sigma)^\circ$ of all complete $*$-selections on $\Sigma$ as enumerating the possible sensory equivalence classes in the sensed space. However, some of the elements in this collection are redundant given the record $P|_t$: an implication $a \le b$ means that no $O \in S(\Sigma)^\circ$ containing $\{a, b^*\}$ is expected by the agent to be witnessed by any observation (see Fig. 1). Formally, a set $O \subseteq \Sigma$ is coherent (Definition A.7) if no pair of elements $a, b \in O$ satisfies $a \le b^*$.
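The notions of $*$-selection, completeness and coherence can be checked mechanically on small examples. The Python sketch below is illustrative only and makes several assumptions beyond the text: sensors are strings with the involution encoded by a trailing `*`, the record is given by a list of generating implications that is closed under contraposition and transitivity by a naive fixed-point loop, and the virtual sensors $\mathbf{0}, \mathbf{0}^*$ are omitted. All function names are hypothetical.

```python
from itertools import product
from typing import Iterable, Set, Tuple


def star(a: str) -> str:
    """Involution a -> a*, encoded by toggling a trailing '*'."""
    return a[:-1] if a.endswith("*") else a + "*"


def poc_closure(sigma: Iterable[str], relations: Iterable[Tuple[str, str]]) -> Set[Tuple[str, str]]:
    """Reflexive-transitive closure of the generators, including contrapositives b* <= a*."""
    leq = {(a, a) for a in sigma}
    for a, b in relations:
        leq.add((a, b))
        leq.add((star(b), star(a)))
    changed = True
    while changed:  # naive fixed point; adequate for tiny examples only
        changed = False
        for (a, b), (c, d) in product(tuple(leq), repeat=2):
            if b == c and (a, d) not in leq:
                leq.add((a, d))
                changed = True
    return leq


def is_star_selection(obs: Set[str]) -> bool:
    """At most one of a, a* is present for every sensor a."""
    return all(star(a) not in obs for a in obs)


def is_complete(obs: Set[str], sigma: Iterable[str]) -> bool:
    """Exactly one of a, a* is present for every sensor a."""
    return is_star_selection(obs) and all(a in obs or star(a) in obs for a in sigma)


def is_coherent(obs: Set[str], leq: Set[Tuple[str, str]]) -> bool:
    """No pair a, b in the set satisfies a <= b*."""
    return not any((a, star(b)) in leq for a in obs for b in obs)


sigma = ["a", "a*", "b", "b*"]
leq = poc_closure(sigma, [("a", "b")])  # the agent believes that a implies b
```

With the single implication $a \le b$, the complete selection `{"a", "b*"}` is the only incoherent one, matching the remark that no observation containing $\{a, b^*\}$ is expected.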


3) The model spaces: The model space $M|_t$ corresponding to the record $P|_t$ takes the form of a cubical complex: a topological space constructed from a collection of vertices (the 0-skeleton), a set of edges (the 1-skeleton), and successively higher dimensional connecting cells in the form of cubes [?]. We choose the vertex set of $M|_t$ to coincide with the set of coherent $*$-selections in $S(\Sigma)^\circ$. Edges are inserted to join any pair of vertices $A, B$ satisfying $|A \setminus B| = 1$ (this condition turns out to be symmetric). The hop-distance on the resulting graph may be seen as a variant of the crude, ‘information motivated’, Hamming distance on $\{0,1\}^\Sigma$. The 1-dimensional skeleton of $M|_t$ is further enriched with higher dimensional cubes to yield the space $\mathrm{Cube}(P|_t)$, as described in appendix A (Definition A.8) for the interested reader. While a fairly detailed knowledge of the geometry and topology of spaces obtained in this way is essential for following our formal arguments regarding the modeling capabilities of this class, much of it is unnecessary for this section’s account of how the agent obtains its representation of $M|_t$, the record $P|_t$.
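For very small sensoria the 1-skeleton described above can be listed by brute force. The Python sketch below is illustrative only: it enumerates all $2^n$ complete $*$-selections, which is exactly the exponential cost the architecture is designed to avoid, and it assumes the order relation is supplied as a set of pairs already closed under transitivity and contraposition. The name `model_space_graph` and the string encoding of sensors are hypothetical.

```python
from itertools import combinations, product
from typing import FrozenSet, List, Sequence, Set, Tuple


def star(a: str) -> str:
    """Involution a -> a*, encoded by toggling a trailing '*'."""
    return a[:-1] if a.endswith("*") else a + "*"


def model_space_graph(
    base: Sequence[str], leq: Set[Tuple[str, str]]
) -> Tuple[List[FrozenSet[str]], List[Tuple[FrozenSet[str], FrozenSet[str]]]]:
    """Vertices are coherent complete *-selections; edges join selections whose
    set difference has exactly one element."""
    vertices = []
    for signs in product((False, True), repeat=len(base)):
        selection = frozenset(a if keep else star(a) for a, keep in zip(base, signs))
        # Coherence: no pair a, b in the selection with a <= b*.
        if not any((a, star(b)) in leq for a in selection for b in selection):
            vertices.append(selection)
    edges = [(A, B) for A, B in combinations(vertices, 2) if len(A - B) == 1]
    return vertices, edges


# Two base sensors with the single believed implication a <= b (and its contrapositive).
vertices, edges = model_space_graph(["a", "b"], {("a", "b"), ("b*", "a*")})

# With no implications at all, the full square {0,1}^2 is recovered.
square_vertices, square_edges = model_space_graph(["a", "b"], set())
```

In the first example the vertex `{"a", "b*"}` is removed from the square, leaving a path with three vertices and two edges.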

4) Maintaining a record of the current state: Returning to the problem of representing the current state, observe that $P|_t$ is expected to change as time progresses, possibly giving rise to observations $O|_t$ that are incoherent with respect to $P|_t$, and therefore represent points ‘outside’ the model space. While the raw observation $O|_t$ must be applied to the agent’s data structure in hopes of improving $P|_t$, the agent must resolve the contradiction within the framework of its current model, replacing the incoherent complete observation $O|_t$ in its role as the record of the current state kept by the agent with a coherent but incomplete observation ([?]), $S|_t := \mathrm{coh}(O|_t)$, satisfying certain naturality requirements; see appendix B-E2 for the complete technical discussion.

This means the agent resolves the contradiction at the price of introducing ambiguity into its record of the current state: instead of having a single vertex of $M|_t$ representing the current state (“complete knowledge”), any vertex containing the set $S|_t$ may turn out to be the correct current state.

The complexity of coherent projection (Lemma III.4) and its role in the agent’s reasoning processes, its interplay with the convexity theory of the model space $M|_t$ and its interpretation as the basis for viewing our architecture as a connectionist model (albeit a very limited one) of cognition will all be discussed in Section III-B.


C. Snapshots 

In [?] we have introduced the rather loose notion of a snapshot, aiming to outline a class of database structures for dynamically maintaining weak poc-set structures from a sequence of observations made by an agent along a trajectory $\varphi$ through $X$. A rigorous treatment of this tool requires some careful definitions.

Definition II.2. Denote by $K_\Sigma$ the graph obtained from the complete graph over the vertex set $\Sigma$ by removing all edges of the form $aa^*$, $a \in \Sigma$. Edges of $K_\Sigma$ will be referred to as proper pairs in $\Sigma$. We will abuse notation and write $ab \in K_\Sigma$ for the edge $\{a, b\}$ of $K_\Sigma$. □

The graph $K_\Sigma$ is the scaffolding for snapshots:

Definition II.3 (Snapshot). A snapshot $S$ over $\Sigma$ consists of the following:

(a) State. Each vertex $a \in \Sigma$ of $K_\Sigma$ is assigned a binary state $\#_a S \in \{0,1\}$. The set

$$\#S = \{a \in \Sigma \mid \#_a S = 1\} \tag{10}$$

is called the state vector of the snapshot $S$ and is required to be a $*$-selection on $\Sigma$ (Definition A.6).

(b) Edge weights. Each edge $ab \in K_\Sigma$ is assigned a non-negative real number denoted $w_{ab} = w_{ab}(S)$.

(c) Learning Thresholds. Each edge $ab \in K_\Sigma$ carries a non-negative real number $\tau_{ab} = \tau_{ab}(S)$ satisfying

$$\tau_{ab} = \tau_{a^*b} = \tau_{ab^*} = \tau_{a^*b^*} \le \tfrac{1}{4}.$$

For every $ab \in K_\Sigma$, the restriction of $S$ to the subgraph induced by the vertices $a, a^*, b$ and $b^*$ will be denoted by $S|_{ab}$, and referred to as a square in $S$. □
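One possible in-memory layout for Definition II.3 is sketched below in Python. It is an illustration under stated assumptions rather than the authors' implementation: sensors are strings with a trailing `*` for the involution, weights and thresholds live in dictionaries keyed by unordered proper pairs, and both default to zero. The class name `Snapshot` and its fields are hypothetical.

```python
from dataclasses import dataclass, field
from itertools import combinations
from typing import Dict, FrozenSet, List, Set


def star(a: str) -> str:
    """Involution a -> a*, encoded by toggling a trailing '*'."""
    return a[:-1] if a.endswith("*") else a + "*"


def proper_pairs(sigma: List[str]) -> List[FrozenSet[str]]:
    """Edges of K_Sigma: all unordered pairs {a, b} except those of the form {a, a*}."""
    return [frozenset((a, b)) for a, b in combinations(sigma, 2) if b != star(a)]


@dataclass
class Snapshot:
    sigma: List[str]
    state: Set[str] = field(default_factory=set)                           # the state vector #S
    weights: Dict[FrozenSet[str], float] = field(default_factory=dict)     # w_ab
    thresholds: Dict[FrozenSet[str], float] = field(default_factory=dict)  # tau_ab

    def __post_init__(self) -> None:
        # One weight register and one threshold per proper pair.
        for edge in proper_pairs(self.sigma):
            self.weights.setdefault(edge, 0.0)
            self.thresholds.setdefault(edge, 0.0)

    def state_is_star_selection(self) -> bool:
        """Check that the state never contains both a and a*."""
        return all(star(a) not in self.state for a in self.state)


base = ["a", "b", "c"]
snapshot = Snapshot(sigma=base + [star(x) for x in base], state={"a", "b*"})
```

With $n$ base sensors the sensorium has $2n$ elements and $K_\Sigma$ has $\binom{2n}{2} - n = 2n(n-1)$ edges, so this layout uses a number of registers quadratic in the number of sensors; for `base` above that is 12 weights.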

The original motivation of [?] for the notion of a snapshot is twofold:

1) Maintaining a consistent representation of the current state. For this purpose we will generally assign the coherent projection of the current state measurement to be stored in $\#S$.

2) Learning implications in the sensorium. To learn an estimate of the implication order on $\Sigma$ inherited from its realization in $X$ it should suffice to maintain a system of weights $w_{ab}$ on $S|_{ab}$ quantifying the relevance (e.g. frequency) of the event $a \wedge b$, allowing one to partially orient the snapshot according to the rule of thumb illustrated in figure 1.

Fig. 1. Determining edge orientations in a snapshot by restricting attention to $S|_{ab}$ (figure label: $w_{ab^*}$ small).

The graphical representation derived from a snapshot in this manner does not automatically define a weak poc set, but is nearly there:

Definition II.4 (poc graph). A poc graph $\Gamma$ over $\Sigma$ is a subgraph of $K_\Sigma$ endowed with an orientation which satisfies, for every $ab \in K_\Sigma$:

- If $ab \in \Gamma$ then $b^*a^* \in \Gamma$;

- If $ab \in \Gamma$ then $a^*b, ba^*, b^*a, ab^* \notin \Gamma$.

By abuse of notation, we use the symbol $ab$ to mean the directed edge emanating from $a$ and pointing to $b$ (if it exists in $\Gamma$). □

In order for a poc graph to represent a weak poc set structure on $\Sigma$ one needs:

Lemma II.5 (derived poc set). The transitive closure of the orientation relation on a poc graph $\Gamma$ over $\Sigma$ is a weak poc set structure on $\Sigma$ if and only if $\Gamma$ has no directed cycles.

Proof: This follows directly from the discussion in example A-A2. ■

The rest of this section mainly deals with characterizing a large class of snapshots encoding acyclic directed poc graphs and with means of evolving snapshot representations of $X$ from trajectories. Given a trajectory $\varphi$ of our agent through $X$, the collection of coincidence indicators

$$c^t_{ab} := (a : \varphi)|_t \cdot (b : \varphi)|_t \tag{11}$$

may be used to evolve a sequence of snapshots $S|_t$ representing, at any time $t \ge 0$, the cumulative influence of the agent’s observations on its perception of implications in the sensorium.

D. Probabilistic Snapshots and Acyclicity

The following set-theoretic identities among the coincidence indicators are easily verified for all $a, b, c \in \Sigma$:

$$\begin{aligned}
c^t_{aa^*} &= 0, \qquad c^t_{ab} = c^t_{ba}, \\
c^t_{aa} &= c^t_{ab} + c^t_{ab^*}, \\
c^t_{ab} + c^t_{a^*b} + c^t_{a^*b^*} + c^t_{ab^*} &= 1, \\
c^t_{a^*b} + c^t_{b^*c} + c^t_{c^*a} &= c^t_{ab^*} + c^t_{bc^*} + c^t_{ca^*}.
\end{aligned} \tag{12}$$

These identities motivate considering snapshots with weights obeying analogous constraints:

Definition II.6 (Probabilistic Snapshot). We say that a snapshot $S$ is probabilistic, if $\#S$ is a coherent $*$-selection and the edge weights satisfy the following:

• Consistency constraint: if $ab, ac \in K_\Sigma$ then

$$w_{ab} + w_{ab^*} = w_{ac} + w_{ac^*}; \tag{13}$$

• Normalization constraint: for any $ab \in K_\Sigma$,

$$w_{ab} + w_{a^*b} + w_{a^*b^*} + w_{ab^*} = 1; \tag{14}$$

• Orientation constraint: if $ab, bc, ac \in K_\Sigma$ then

$$w_{a^*b} + w_{b^*c} + w_{c^*a} = w_{ab^*} + w_{bc^*} + w_{ca^*}. \tag{15}$$

We denote the set of all probabilistic snapshots over $\Sigma$ by [?], or simply [?] when there is no danger of confusion. □

A fundamental observation regarding probabilistic snapshots is the following:

Proposition II.7 (Acyclicity Lemma). Suppose $S$ is a probabilistic snapshot over $\Sigma$ and $\Gamma$ is a poc graph satisfying the orientation cocycle condition:

$$ab \in \Gamma \;\Rightarrow\; \omega(ab) := w_{a^*b} - w_{ab^*} > 0. \tag{16}$$

Then $\Gamma$ contains no directed cycles.

Proof: See appendix B-A. ■
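The constraints (13)-(15) can be checked numerically on weights obtained as relative frequencies of coincidences. The Python sketch below is illustrative only: it assumes three order-0 base sensors observed as 0/1 tuples, encodes a sensor or its complement as a pair `(index, positive)`, uses exact rational arithmetic, and does not check the coherence requirement on $\#S$ from Definition II.6. All names are hypothetical.

```python
from fractions import Fraction
from itertools import product
import random


def neg(lit):
    """Complement of a literal (index, positive), playing the role of a -> a*."""
    index, positive = lit
    return (index, not positive)


def value(obs, lit):
    """Reading of a literal on one observation (a tuple of 0/1 base readings)."""
    index, positive = lit
    return obs[index] if positive else 1 - obs[index]


def frequency_weights(observations, n):
    """w_xy = relative frequency of x and y firing together, over all proper pairs."""
    lits = [(i, s) for i in range(n) for s in (True, False)]
    total = len(observations)
    return {
        (x, y): Fraction(sum(value(o, x) * value(o, y) for o in observations), total)
        for x, y in product(lits, repeat=2)
        if x[0] != y[0]
    }


def satisfies_constraints(w, n):
    """Check normalization (14), consistency (13) and orientation (15)."""
    lits = [(i, s) for i in range(n) for s in (True, False)]
    for a, b in product(lits, repeat=2):
        if a[0] == b[0]:
            continue
        if w[a, b] + w[neg(a), b] + w[neg(a), neg(b)] + w[a, neg(b)] != 1:
            return False
        for c in lits:
            if c[0] in (a[0], b[0]):
                continue
            if w[a, b] + w[a, neg(b)] != w[a, c] + w[a, neg(c)]:
                return False
            lhs = w[neg(a), b] + w[neg(b), c] + w[neg(c), a]
            rhs = w[a, neg(b)] + w[b, neg(c)] + w[c, neg(a)]
            if lhs != rhs:
                return False
    return True


rng = random.Random(0)  # fixed seed so the example is reproducible
observations = [tuple(rng.randint(0, 1) for _ in range(3)) for _ in range(40)]
weights = frequency_weights(observations, 3)
```

Note that (15) is equivalent to $\omega(ab) + \omega(bc) + \omega(ca) = 0$ with $\omega$ as in (16), so $\omega$ cannot be positive on all three edges of a directed triangle; the general statement is the content of Proposition II.7.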


This proposition puts the vague notion from figure 1 on how to derive implications from a snapshot on a firm footing:

Proposition II.8 (Poc graphs from snapshots). Let $S$ be a probabilistic snapshot. Construct a poc graph $\mathrm{Dir}(S)$ by setting

$$ab \in \mathrm{Dir}(S) \iff w_{ab^*} < \min\left\{\tau_{ab},\, w_{ab},\, w_{a^*b},\, w_{a^*b^*}\right\}. \tag{17}$$

Then $\mathrm{Dir}(S)$ is an acyclic poc graph. □

Proof: The symmetries of $\tau$ and $w$ immediately imply $ab \in \mathrm{Dir}(S)$ iff $b^*a^* \in \mathrm{Dir}(S)$. The strict inequality in (17) implies the second condition of a poc graph holds as well. Since the orientation cocycle is positive on every edge of $\mathrm{Dir}(S)$ by definition, the acyclicity lemma applies. ■
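Rule (17) translates directly into a quadratic-time scan over ordered proper pairs. The Python sketch below is illustrative only and makes simplifying assumptions: sensors are strings with a trailing `*` for the involution, weights are stored per unordered pair, and a single global threshold `tau` stands in for the per-square thresholds $\tau_{ab}$. The names `derived_poc_graph` and `is_acyclic` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple


def star(a: str) -> str:
    """Involution a -> a*, encoded by toggling a trailing '*'."""
    return a[:-1] if a.endswith("*") else a + "*"


def derived_poc_graph(
    sigma: List[str], w: Dict[FrozenSet[str], float], tau: float
) -> Set[Tuple[str, str]]:
    """Directed edges ab with w_ab* < min(tau, w_ab, w_a*b, w_a*b*), as in rule (17)."""
    def wt(x: str, y: str) -> float:
        return w[frozenset((x, y))]

    edges = set()
    for a in sigma:
        for b in sigma:
            if b in (a, star(a)):
                continue  # not a proper pair
            if wt(a, star(b)) < min(tau, wt(a, b), wt(star(a), b), wt(star(a), star(b))):
                edges.add((a, b))
    return edges


def is_acyclic(edges: Set[Tuple[str, str]]) -> bool:
    """Depth-first search for a directed cycle."""
    successors: Dict[str, List[str]] = {}
    for a, b in edges:
        successors.setdefault(a, []).append(b)
    done, active = set(), set()

    def visit(v: str) -> bool:
        if v in active:
            return False  # back edge: a directed cycle exists
        if v in done:
            return True
        active.add(v)
        ok = all(visit(u) for u in successors.get(v, []))
        active.discard(v)
        done.add(v)
        return ok

    return all(visit(v) for v in list(successors))


# One square in which the event "a and not b" was never observed.
sigma = ["a", "a*", "b", "b*"]
w = {
    frozenset(("a", "b")): 0.5,
    frozenset(("a*", "b")): 0.25,
    frozenset(("a*", "b*")): 0.25,
    frozenset(("a", "b*")): 0.0,
}
edges = derived_poc_graph(sigma, w, tau=0.125)
```

In this example only the edge $ab$ and its contrapositive $b^*a^*$ pass the test, so the derived graph records the single implication $a \le b$.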

The element of thresholding present in ( [TtI i may also be 
used as a part of the updating procedure of a probabilistic 
snapshot, without affecting the derived poc set: 

Proposition II.9 (Snapshot Truncation). Let $\mathcal{S}$ be a
probabilistic snapshot. Define a new snapshot $\lfloor \mathcal{S} \rfloor$ to have the same
state as $\mathcal{S}$, while for every $ab \in K_{\mathcal{S}}$ satisfying (17) the weights
are updated as follows:

$$ \begin{cases}
w_{ab^*} \mapsto 0 \\
w_{ab} \mapsto w_{ab} + w_{ab^*} \\
w_{a^*b^*} \mapsto w_{a^*b^*} + w_{ab^*} \\
w_{a^*b} \mapsto w_{a^*b} - w_{ab^*}
\end{cases} \tag{18} $$

Then $\mathrm{Dir}(\mathcal{S}) = \mathrm{Dir}(\lfloor \mathcal{S} \rfloor)$.

Proof: The proof amounts to a direct verification that
$\lfloor \mathcal{S} \rfloor$ is probabilistic, and that $ab \in K_{\mathcal{S}}$ satisfies (17) in $\lfloor \mathcal{S} \rfloor$ if
and only if the same condition is satisfied by $ab$ in $\mathcal{S}$. ■
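The truncation rule (18) can be checked numerically on a single square. What follows is only a minimal sketch under stated assumptions, not the authors' implementation: the dictionary layout (keys `"ab"`, `"ab*"`, `"a*b"`, `"a*b*"`) and the function names `satisfies_17` and `truncate_square` are hypothetical, chosen for illustration.

```python
def satisfies_17(w, tau):
    """Condition (17) for the edge ab: w[ab*] lies strictly below the
    threshold tau and below the three other weights of the square."""
    return w["ab*"] < min(tau, w["ab"], w["a*b"], w["a*b*"])


def truncate_square(w, tau):
    """Apply (18) to one square when (17) holds for ab; otherwise copy it.

    The mass of w[ab*] is redistributed so that the marginals
    w_a = w[ab] + w[ab*] and w_b = w[ab] + w[a*b] are unchanged.
    """
    if not satisfies_17(w, tau):
        return dict(w)
    excess = w["ab*"]
    return {
        "ab*": 0.0,
        "ab": w["ab"] + excess,
        "a*b*": w["a*b*"] + excess,
        "a*b": w["a*b"] - excess,
    }


# Hypothetical normalized weights on one square (they sum to 1).
square = {"ab": 0.30, "ab*": 0.02, "a*b": 0.28, "a*b*": 0.40}
truncated = truncate_square(square, tau=0.05)
```

In this example the total mass and both marginals are preserved, and the edge ab still satisfies (17) after truncation, which is the content of the proposition restricted to one square.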


Following lemma II.5 we may now safely define:


Definition II.10 (Derived Poc Set). Let $\mathcal{S}$ be a probabilistic
snapshot. Denote by $\mathrm{Poc}(\mathcal{S})$ the weak poc set structure
obtained by setting $a < b$ iff there exists a directed path in
$\mathrm{Dir}(\mathcal{S})$ from $a$ to $b$. □


We now proceed to introduce and study two possible 
snapshot constructions. 


E. Empirical Snapshots 

The empirical snapshot structure maintains an empirical
approximation of the relative frequencies of observations of
the form $a \wedge b$, $a, b \in E$. For any trajectory $p$ of the agent
through X we could try setting

$$ w_{ab}(\mathcal{S}|_t) := \sum_{k=1}^{t} c^k_{ab} \tag{19} $$

with $\mathcal{S}|_t$ a trivial snapshot for all $t \le 0$.

Definition II.11. A snapshot $\mathcal{S}$ with $w_\bullet = 0$ is said to be
trivial and denoted Null. □

Properties (11) then imply that $\mathcal{S}|_t$ satisfies the consistency
and cocycle constraints (defn. II.6) for all $t > 0$, and would
satisfy the normalization constraint if we replace the weights
$w_{ab}$ by $\tfrac{1}{t} w_{ab}$ throughout.

1) Construction and Properties: The formal construction is 
as follows; 

Definition II.12 (Empirical Snapshot). A snapshot $\mathcal{S}$ over E is
an empirical snapshot if the following conditions are satisfied:

• For all $ab \in K_{\mathcal{S}}$, $w_{ab} \in \mathbb{Z}_{\ge 0}$;

• For all $ab \in K_{\mathcal{S}}$, the expression

$$ w_a := w_{ab} + w_{ab^*} \tag{20} $$

is independent of the choice of b, and vanishes only if 

#a = 0; 

• The following expression does not depend on $a \in E$:

$$ \mathrm{Clk}(\mathcal{S}) := w_a + w_{a^*} \tag{21} $$

Denote the set of empirical snapshots over E by (or just
S' when justified). □

The evolution of an empirical snapshot under a sequence of
observations is then defined through:


Definition II.13 (Empirical Update). Let $\mathcal{S}$ be an empirical
snapshot and let $O \subset E$ be a complete $*$-selection. The snapshot
$O * \mathcal{S}$ is the empirical snapshot obtained from $\mathcal{S}$ by setting

$$ w_{ab}(O * \mathcal{S}) := w_{ab}(\mathcal{S}) + (1_O : a) \cdot (1_O : b) \tag{22} $$

for all $ab \in K_{\mathcal{S}}$. The state of $O * \mathcal{S}$ is set to $\mathrm{coh}(O)$ (where,
recall, this is the coherent projection (104) computed with
respect to the weak poc set structure derived from the new
weights). □

Definition II.14 (Evolution). We say that a snapshot $\mathcal{T}$ over
E is an evolution of a snapshot $\mathcal{S}$ if either $\mathcal{S} = \mathcal{T}$ or there
is a sequence $(O_i)_{i=1}^{k}$ of complete $*$-selections in E such that
$\mathcal{T} = O_k * \cdots * O_1 * \mathcal{S}$. □
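To make the empirical update (22) concrete, here is a small sketch that evolves the trivial snapshot through a short sequence of observations. It rests on several assumptions that are not in the text: vertices are strings, a trailing `*` marks a complement, weights are stored in a dictionary over ordered pairs, and the names `star`, `trivial_snapshot`, `empirical_update` and `evolve` are hypothetical. The state component $\mathrm{coh}(O)$ of the snapshot is deliberately left out.

```python
from itertools import product


def star(v):
    """Complement of a vertex: 'a' <-> 'a*'."""
    return v[:-1] if v.endswith("*") else v + "*"


def trivial_snapshot(sensors):
    """Zero weights on all ordered pairs of vertices from distinct sensors."""
    vertices = [v for s in sensors for v in (s, star(s))]
    return {
        (a, b): 0
        for a, b in product(vertices, repeat=2)
        if a != b and a != star(b)
    }


def empirical_update(weights, observation):
    """Update (22): w_ab grows by 1 exactly when a and b both lie in O."""
    return {
        (a, b): w + int(a in observation and b in observation)
        for (a, b), w in weights.items()
    }


def evolve(weights, observations):
    """Apply the empirical update once per observation, oldest first."""
    for observation in observations:
        weights = empirical_update(weights, observation)
    return weights


# Three hypothetical complete *-selections on the sensors a and b.
history = [{"a", "b"}, {"a*", "b"}, {"a*", "b*"}]
snapshot = evolve(trivial_snapshot(["a", "b"]), history)

# w_a + w_a* (cf. (20) and (21)) recovers the number of observations.
clock = sum(snapshot[(a, b)] for a in ("a", "a*") for b in ("b", "b*"))
```

The integer counters grow without bound as observations accumulate, which is the space cost that a decaying update is meant to avoid.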


Empirical snapshots are characterized by their ancestry:

Lemma II.15. A non-trivial snapshot $\mathcal{S}$ over E is empirical
if and only if it is an evolution of the trivial snapshot.

Proof: See appendix B-B. ■

Having characterized empirical snapshots as evolutions of
the trivial one, we return to the observation that the weight
$w_\bullet(\mathcal{S})/\mathrm{Clk}(\mathcal{S})$ on $K_{\mathcal{S}}$ (see (21)) defines a probabilistic
snapshot. We may thus define $\mathrm{Dir}(\mathcal{S})$ accordingly, by setting

$$ ab \in \mathrm{Dir}(\mathcal{S}) \iff w_{ab^*} < \min\{ \tau_{ab} \cdot \mathrm{Clk}(\mathcal{S}),\; w_{ab},\; w_{a^*b},\; w_{a^*b^*} \} \tag{23} $$

and conclude that:


Proposition II.16 (empirical implies acyclic). If $\mathcal{S}$ is an
empirical snapshot, then $\mathrm{Dir}(\mathcal{S})$ as defined in (23) is an
acyclic poc graph, and $\mathrm{Poc}(\mathcal{S})$ as defined in defn. II.10 is
a weak poc set structure on E.


We will henceforth refer to DBAs endowed with empirical 
snapshots and utilizing the empirical update as empirical 
agents. 
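The thresholding rule (23) for empirical snapshots can be sketched as follows. The representation is an assumption made for illustration only: weights are integer counts in a dictionary keyed by ordered pairs of vertex names, a trailing `*` marks a complement, a single uniform threshold `tau` is used, and `star` and `dir_edges` are hypothetical names.

```python
def star(v):
    """Complement of a vertex: 'a' <-> 'a*'."""
    return v[:-1] if v.endswith("*") else v + "*"


def dir_edges(weights, clock, tau):
    """Edges ab with w[ab*] < min(tau * Clk, w[ab], w[a*b], w[a*b*]), cf. (23)."""
    edges = set()
    for (a, b) in weights:
        lhs = weights[(a, star(b))]
        rhs = min(
            tau * clock,
            weights[(a, b)],
            weights[(star(a), b)],
            weights[(star(a), star(b))],
        )
        if lhs < rhs:
            edges.add((a, b))
    return edges


# Hypothetical counts after 12 observations in which a never fired without b.
counts = {("a", "b"): 4, ("a", "b*"): 0, ("a*", "b"): 3, ("a*", "b*"): 5}
# Weights are symmetric: w_ab = w_ba.
weights = dict(counts)
weights.update({(b, a): w for (a, b), w in counts.items()})

edges = dir_edges(weights, clock=12, tau=0.1)
```

With these counts only the implication from a to b and its contrapositive from b* to a* are recorded, so the resulting graph is closed under the symmetry ab, b*a* and has no directed cycle.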

2) Performance of Empirical Agents: In this paper we
restrict attention to agents endowed with a fixed finite set of
actions. An agent starting out at time $t = 0$ with a trivial
snapshot has no knowledge of its environment, and is
therefore assumed to engage in random exploration for some
time, until actionable information becomes available. This
motivates the question of how well the memory structure
of an empirical agent performs in this initial stage.

In the case where X is finite and the agent's actions are
deterministic this is easy to formulate. Let $E_{\mathrm{act}}$ be the set
of available actions, and consider the graph with vertex set X,
where a vertex $x$ is joined to a vertex $y$ by an edge labeled by an
action $\alpha \in E_{\mathrm{act}}$ if applying $\alpha$ at $x$ results in $y$. Thus, X becomes
endowed with the structure of a Markov chain, where we draw
actions uniformly at random in every state. Focus on the case
when all the actions available to the agent are reversible, in
the sense that there is an edge from $x$ to $y$ if and only if
there is an edge from $y$ to $x$ (loops are allowed as well).
Then the corresponding Markov chain is a random walk and
its stationary distribution over X, denoted by $\pi$, is uniform
[60] over each connected component of the resulting transition
system. Thus, each normalized weight is nothing but
the empirical estimate of the joint probability, given by $\pi$, for
$a, b \in E$ to fire synchronously.

Restricting to a reachability component, we may assume X
is connected. By abuse of notation, for $ab \in K_{\mathcal{S}}$ denote

$$ \pi(ab) = \pi(\rho(a) \cap \rho(b)) \tag{24} $$

Let $\mathrm{Dir}^\infty$ denote the matrix with entries $\mathrm{Dir}^\infty_{ab} = 1$ whenever
$\rho(a)$ is contained in $\rho(b)$ up to a precision of $\tau_{ab}$, that is:

$$ \pi(ab^*) < \min\{ \tau_{ab},\; \pi(ab),\; \pi(a^*b^*),\; \pi(a^*b) \} \tag{25} $$

and set $\mathrm{Dir}^\infty_{ab} = 0$ otherwise. This matrix represents the true
poc set structure to be learned by the agent, as determined
by the fixed learning thresholds. Analogously we let $\mathrm{Dir}^t$ be
the matrix with $\mathrm{Dir}^t_{ab} = 1$ iff the directed edge $ab \in K_{\mathcal{S}}$
is contained in the derived poc graph $\mathrm{Dir}(\mathcal{S}|_t)$ (and 0
otherwise). A good measure of the agent's performance would
be the behavior of the total error

$$ \mathrm{Err}(t) := \| \mathrm{Dir}^t - \mathrm{Dir}^\infty \|_1 \tag{26} $$

over time (the matrices viewed simply as real vectors of the
appropriate dimension). By Theorem 5.1 in [60], the agent's
random walk converges to $\pi$ at an exponential rate depending
only on the transition structure of X determined by the actions
$E_{\mathrm{act}}$. We conclude:

Proposition II.17. Suppose a DBA performs a random walk
on a connected X, at each moment in time performing one
of a fixed finite set of reversible deterministic actions. Then
$\mathrm{Err}(t)$ converges to zero at an exponential rate.


With such strong performance guarantees for a broad class
of empirical agents, we are left to examine the variation in
performance as a function of the geometry/topology of the
environment (beyond the guarantees given by the preceding
discussion). We have run simulations in the following settings:

(a) The agent performs a random walk along a path with
20 edges (example A-D1), choosing between one step
forward and one step back uniformly at random for
every $t \in T$, learning the poc structure of a sensorium
consisting of 20 'GPS' sensors, as described in the
corresponding example;

(b) The agent performs a random walk along a cycle with
20 edges, choosing between a clockwise and a counter-clockwise
step uniformly at random for every $t \in T$, and
learning a sensorium consisting of 20 beacon sensors as
described in example A-D3;

(c) The agent performs a random walk (up/down/left/right)
on a square grid with 10 'GPS' sensors along each of the
x- and y-axes;

(d) The agent performs a random walk (forward/back) along
a path with 20 edges, but the sensors are chosen to have
random activation fields (randomly chosen subsets of the
set of vertices along the path); the sensor fields have been
drawn anew prior to each separate run.

The number of sensors is the same for each setting, and each
agent carries out 50 runs of a length that is cubic in the
number of sensors, starting at a random position with an empty
snapshot. We have tested 10 different agents for each setting,
corresponding to 10 different values of the learning threshold,
spread linearly in the interval from $1/20^2$ to the maximal
meaningful value of $1/4$ (where one should not expect much
useful learning to occur).

Footnote: Recall that $\mathrm{Dir}(\mathcal{S})$ introduced in Prop. II.8 is a directed
graph. The new notation is intended to connote a matrix
representation of such a graph.

[Figure 2: four panels, including (b) Circle w/Beacons and
(d) Interval w/Random Sensors; horizontal axis: time.]

Fig. 2. Logarithmic plot of the mean number of incorrect edges in
the derived poc graph of an empirical snapshot (20 sensors), for
learning thresholds varying linearly between the two extreme values
(from cyan/light to blue/dark), averaged over 50 runs of random
walks each.

The results are summarized in figure 2, plotting $\mathrm{Err}(t)$,
where we have replaced the matrix $\mathrm{Dir}^\infty$ as defined in (25)
by the $\{0, 1\}$-valued matrix

$$ \mathrm{Dir}^\infty_{ab} = 1 \iff \rho(a) \subseteq \rho(b) \tag{27} $$

to render the effect of choosing different values for the learning
threshold more visible in the graph of $\mathrm{Err}(t)$.

The resulting plots show significant, though subtle, differences
in performance between the four settings, illustrating
the similarities and differences in the weak poc set structures
being approximated, most notably:

- The sharper initial decline in the mean deviation for
(b) and (c) in comparison to (a) is expected due to the
relative abundance of crossing in the former, as opposed
to complete nesting (see definition A.22) in the latter.

- Performance in the random setting (d) seems to lag
significantly behind performance in any of the structured
settings.

- Performance in the completely nested setting (a) seems
to provide exponentially fast learning no matter what; by
contrast, the other settings seem to experience a transition
between two modes, depending on how small the learning
threshold $\tau$ is:

1) For large $\tau$, the deviation plateaus.

2) For small enough $\tau$, the deviation decreases to zero
in finite time.

We expect the critical value of $\tau$ in any finite setting to
be somewhere around the minimum probability of a state
(under the stationary distribution of the random walk):
in order for a relation $a < b$ to be put on record, it is
necessary for the agent to have visited $\rho(a) \cap \rho(b^*)$ at
a frequency below $\tau$; the smaller $\tau$ is, the fewer false
relations will be recorded for posterity.

Recalling that the poc set representing the ground truth
in (c) is the direct sum (see A-D2) of two smaller
copies of (a) having 10 sensors each, we see that the
crossing relations between the x-axis sensors and their
y-axis counterparts account for 800 of the 1600 entries
(two $20 \times 20$ null sub-matrices) in the adjacency matrix
of the derived graph. Thus, the two experiments are not
that different; loosely speaking, the $10 \times 10$ square grid
experiment projects onto a Cartesian product of two 10-path
experiments where the random walk on the 10-path
becomes a lazy walk with probability $\tfrac{1}{2}$ to stay put. In
other words, the behavior of (c) may be inferred from the
behavior of (a).

Not so when comparing (a) and (c) with (b); note how
the sub-critical values of the learning parameters in (a), (c)
and (d) force the deviation plot to 'plunge' into the x-axis
versus the horizontal asymptote behavior of (b). In
view of theorem A.33, our guess is that the environment
(the circle) not being contractible has something to do
with this qualitative change in behavior, but this requires
further investigation.


F. Discounted Snapshots 

A notable weakness of empirical snapshots as a data 
structure is their potential high cost in space, due to the 
need for indefinitely maintaining integer-valued counters. In 
some sense, the entire history of the agent matters, and, in 
some sense, matters too much. This motivates the search for 
an alternative, more quantized, updating mechanism whose 
dependence on any given past observation weakens at a fixed 
rate. 

1) Construction and Properties:

Definition II.18 (discounted update). Let $q \in [0, 1]$ and let
$\mathcal{S}$ be a probabilistic snapshot over E. For any complete
$*$-selection $O$ on E we define the snapshot $O *_q \mathcal{S}$ to be the
snapshot with weights determined by

$$ w_{ab}(O *_q \mathcal{S}) := q\, w_{ab}(\mathcal{S}) + (1 - q)\, (1_O : a) \cdot (1_O : b) \tag{28} $$

The state of $O *_q \mathcal{S}$ is set to $\mathrm{coh}(O)$, the reduction being
computed with respect to the weak poc set structure derived
from the new weights. Finally, define the q-discounted update
of $\mathcal{S}$ to be the snapshot $\lfloor O *_q \mathcal{S} \rfloor$, and refer to $q$ as the
decay parameter. □

A significant advantage of the discounted update is its
applicability to arbitrary probabilistic snapshots:

Lemma II.19. The q-discounted update of a probabilistic
snapshot by a complete $*$-selection is probabilistic. □

Proof: It is clear that the discounted update preserves
probabilisticity. Proposition II.9 finishes the proof. ■
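The weight update (28) is a convex combination of the old weight and the current coincidence indicator, so normalization is preserved. A minimal sketch on one square follows; the dictionary layout and the name `discounted_update` are assumptions made for illustration, and the truncation step and the state update are omitted.

```python
def discounted_update(weights, observation, q):
    """Update (28): w_ab <- q * w_ab + (1 - q) * [a in O and b in O]."""
    return {
        (a, b): q * w + (1.0 - q) * float(a in observation and b in observation)
        for (a, b), w in weights.items()
    }


# Hypothetical normalized weights on the square spanned by sensors a and b.
square = {
    ("a", "b"): 0.25,
    ("a", "b*"): 0.25,
    ("a*", "b"): 0.25,
    ("a*", "b*"): 0.25,
}

# Observe a and b firing together, with decay parameter q = 0.9.
updated = discounted_update(square, observation={"a", "b"}, q=0.9)
```

Only the cell matching the observation gains mass; every other cell decays by the factor q, which is how the influence of an old observation fades at a fixed rate.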

Consider the length of time (or the amount of evidence)
it takes a discounted snapshot to acquire an implication,
compared to the amount of evidence required for giving up
an implication already on record.

Assuming a fixed value of the decay parameter $q$ over a
considerable length of time, a lower bound on the amount
of time $\Delta t$ required for $w_{ab^*}$ to become small enough for a
relation $a < b$ to be put on record is given by the situation
when a long enough sequence of consecutive observations with
$a \wedge b^*$ not occurring is made:

$$ q^{\Delta t} < \tau_{ab} \iff \Delta t > \frac{\log_2 \tau_{ab}}{\log_2 q} \tag{29} $$

On the other hand, once the relation $a < b$ has been put on
record, the number $\Delta t$ of successive observations of $a \wedge b^*$
required for replacing this relation with $a \pitchfork b$ must satisfy:

$$ \Delta t\, (1 - q) > \tau_{ab} \iff \Delta t > \frac{\tau_{ab}}{1 - q} \tag{30} $$

(this much is guaranteed by the truncation mechanism).
Overall, it seems that choosing a value of $q$ with $1 - q$
sufficiently small should produce meaningful learning: lower
values of $\tau_{ab}$ make it both harder to learn and easier to unlearn
a false relation, while maintaining a qualitative difference
between the necessary requirements for either process.
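The asymmetry between the two bounds (29) and (30) is easy to see numerically. The sketch below simply evaluates the smallest integers satisfying each strict inequality; the function names and the sample values of `q` and `tau` are illustrative assumptions, not parameters taken from the experiments.

```python
import math


def learn_lower_bound(q, tau):
    """Smallest integer dt with q**dt < tau, i.e. dt > log(tau) / log(q), cf. (29)."""
    return math.floor(math.log(tau) / math.log(q)) + 1


def unlearn_lower_bound(q, tau):
    """Smallest integer dt with dt * (1 - q) > tau, i.e. dt > tau / (1 - q), cf. (30)."""
    return math.floor(tau / (1.0 - q)) + 1


# Hypothetical parameter values.
q, tau = 0.9, 1.0 / 400
dt_learn = learn_lower_bound(q, tau)
dt_unlearn = unlearn_lower_bound(q, tau)
```

For these values, acquiring a relation needs dozens of consecutive observations without a counterexample, while the bound for dropping it is a single observation. Both functions assume generic inputs; values sitting exactly on a boundary may be affected by floating-point rounding.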

Keeping $q$ fixed over long periods of time places an emphasis
on the values of the learning thresholds $\tau_{ab}$. As these
values do not have to be chosen uniformly over the snapshot,
one might want to vary the values of the learning thresholds
individually with the aim of altering the flexibility of the
learning process in the corresponding square. This opens up
a doorway to employing methods for varying the learning
thresholds and the decay parameter in ways analogous to those
of the cited literature (e.g. [22]) as a means of improving the
quality/dependability of the model space. The simulation results
below emphasize the need for this kind of control, showing that a
discounted agent is much more susceptible to changes in geometry
and topology/combinatorics of the sensor fields than an empirical
one.


2) Performance Analysis: Figure 3 compares the mean
performance of time-discounted snapshot learning from a
random walk in the four settings described earlier in II-E2 for
ten values of the decay parameter $q$, indexed by
$k \in \{0, \ldots, 9\}$.

One immediately notices, in comparison with the empirical
case, that the dependence of the learning process on the
discount parameter is not monotone: it would seem that a
choice of $k = 5$ works best for all settings in terms of
optimizing the eventual deviation (though it is hard to say
what 'best' would even mean for (d)), while a choice of
$k = 4$ is more reasonable given the observed waiting time until
meaningful learning occurs in the structured environments (a)-(c).


Similar observations to those made for the empirical case
(figure 2) regarding the interplay between 'learning modes'
and geometry/topology can be made here as well, but are more
subtle, as the comparison in figure 4 shows.

[Figure 3: four panels, including (b) Circle w/Beacons and
(d) Interval w/Random Sensors; horizontal axis: time.]

Fig. 3. Mean number of incorrect edges in the derived poc graph of a
discounted snapshot in 4 environments (20 sensors each) for varying
values of the decay parameter $q$, with $k$ from 0 (red/dark) to
9 (yellow/light), averaged over 50 runs of a random walk.

[Figure 4: four panels, including (b) Circle w/Beacons and
(d) Interval w/Random Sensors.]

Fig. 4. A comparison of the mean number of incorrect edges in the
derived poc graph as a function of time, for an empirical snapshot
(blue) and a discounted one (red). Here $\tau = 1/20^2$ and $q$ is held
fixed.


G. Further Adjustments to the Weak Poc Structure

The implication record constructed from a probabilistic
snapshot in the preceding section does not recognize possible
equivalences among sensations: if, for whatever reason, a
relation of the form

$$ w_{ab^*} = w_{a^*b} = 0 \tag{31} $$

takes place in a snapshot $\mathcal{S}$, it becomes reasonable to interpret
it as the logical equivalence $a \Leftrightarrow b$, yet $\mathrm{Dir}(\mathcal{S})$ will not register
any relations in the square $\mathcal{S}|_{ab}$, barring an agent equipped
with $\mathcal{S}$ from utilizing the currently observed equivalence.

Thus, an adjustment of $\mathrm{Dir}(\mathcal{S})$ is required if we are to allow
our agents the advantage of reasoning about equivalences. The
following extension of $\mathrm{Dir}(\mathcal{S})$ turns out to serve our purposes
for a restricted class of probabilistic snapshots:

Definition II.20. Let $\mathcal{S}$ be a probabilistic snapshot. The poc
graph $\mathrm{Dir}(\mathcal{S})_0$ is defined to be the poc graph obtained from
$\mathrm{Dir}(\mathcal{S})$ by adding the directed edges $ab, ba, a^*b^*, b^*a^*$ for
each $ab \in K_{\mathcal{S}}$ satisfying (31). □

It turns out that $\mathrm{Dir}(\mathcal{S})_0$ gives rise to an adequate weak
poc set structure and model space, provided $\mathcal{S}$ satisfies the
additional requirement:

Definition II.21. A snapshot $\mathcal{S}$ is said to satisfy the triangle
inequality, if

$$ w_{a^*b} + w_{ab^*} + w_{b^*c} + w_{bc^*} \ge w_{a^*c} + w_{ac^*} \tag{32} $$

holds whenever $ab, bc, ac \in K_{\mathcal{S}}$. □

A class of examples of special significance in this work is
that of snapshots $\mathcal{S}$ whose edge weights are derived from a
measure $\mu$ on a space $Z$ by pulling back along a realization
$\rho$ of E in $Z$ as follows:

$$ w_{ab} = \mu(\rho(a) \cap \rho(b)) \tag{33} $$

The triangle inequality for $\mathcal{S}$ is then an immediate consequence
of the well-known triangle inequality for measures (see
chapter 3 of the cited reference):

$$ \mu(A \triangle C) \le \mu(A \triangle B) + \mu(B \triangle C), \tag{34} $$

where $A, B, C \subset Z$ are arbitrary measurable sets and $A \triangle B$
denotes the symmetric difference $(A \setminus B) \cup (B \setminus A)$.

The coincidence indicators $c^t_{ab}$ of (11) are a special case
of this example (where $\mu$ is an atomic measure), and so are
empirical snapshots (as their weights are sums of coincidence
indicators). Discounted snapshots fall into this class, too, as
their weights are convex combinations of coincidence indicators.

Due to the technical nature of the interactions between
$\mathrm{Dir}(\mathcal{S})$ and the extension $\mathrm{Dir}(\mathcal{S})_0$, we postpone the formal
discussion of these interactions to appendix B-C. The bottom
line, however, is that for any probabilistic snapshot structure
satisfying the triangle inequality our agent may safely apply
the control protocols of the next section to the extended poc
graph derived from the agent's current snapshot to arrive
at action choices while taking advantage of the perceived
equivalences within the sensorium.

Although technically we are obliged to distinguish between
$\mathrm{Dir}(\mathcal{S})$ and $\mathrm{Dir}(\mathcal{S})_0$, as well as between the weak poc set
structures they correspond to, we will treat these objects as
identical for the sake of simplifying the rest of the exposition.
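For weights pulled back from a measure, the sum of the two off-diagonal weights of a square is the mass of a symmetric difference, so the triangle inequality for snapshots reduces to the triangle inequality for measures. The sketch below checks this on a small finite example. Everything concrete in it is an assumption for illustration: the six-point space, the uniform point masses, the three sensor fields, and the helper names.

```python
from itertools import permutations


def mass(mu, subset):
    """Measure of a finite subset, for a measure given by point masses."""
    return sum(mu[z] for z in subset)


def symmetric_difference_mass(mu, A, B):
    """w_{ab*} + w_{a*b}: the mass of points lying in exactly one of A, B."""
    return mass(mu, A - B) + mass(mu, B - A)


def satisfies_triangle_inequality(mu, fields, tol=1e-12):
    """Check the snapshot triangle inequality on every ordered triple of sensors."""
    return all(
        symmetric_difference_mass(mu, fields[a], fields[b])
        + symmetric_difference_mass(mu, fields[b], fields[c])
        >= symmetric_difference_mass(mu, fields[a], fields[c]) - tol
        for a, b, c in permutations(fields, 3)
    )


# Hypothetical space Z = {0, ..., 5} with uniform point masses.
mu = {z: 1.0 / 6 for z in range(6)}
# Hypothetical realizations of three sensors as subsets of Z.
fields = {"a": {0, 1}, "b": {1, 2, 3}, "c": {3, 4, 5}}

holds = satisfies_triangle_inequality(mu, fields)
```

The check only covers uncomplemented sensors; complements are again subsets of the same space, so the same argument applies to them.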


III. Control with Snapshots 

This section introduces the basic control function of a
snapshot. We begin by introducing a formalism designed
to treat discrete actions as a sub-structure of the binary
sensorium, and discuss the effect of this formalism on shaping
the model space (III-A). We next turn to a discussion of the
snapshot $\mathcal{S}|_t$ as a highly efficient computational mechanism
for coherent state-updating and for decision making based on
the geometry of the model space $M|_t$ (III-B).

At the technical level, this section requires an understanding
of the convexity theory of cubings: The classical results are
covered in appendix A-D, while our new technical results
underlying the use of snapshots for greedy navigation in
cubings are covered in appendix B-E.

Building on these results, section III-B2 introduces the
mechanism of signal propagation over a snapshot, which
realizes the computation both of coherent projection and of closest
point projections to prescribed convex subsets of the associated
model space. This mechanism suggests a view of snapshot
architectures as highly simplified connectionist architectures,
and some related work in the literature is discussed.

At the heart of our proposed decision-making algorithm is
an assumption that the sensorium is rich enough to detect
direct causal relations between actions and other sensations.
We provide a fairly broad formalization of this assumption in
III-B3 (with an example in III-B4), and prove the ability of
an agent to correctly 'hallucinate' the immediate consequences
of taking an action, provided sufficient exposure.

An algorithm using this tool to attempt greedy navigation
over $M|_t$ to a specified target state is proposed in III-B5,
and some of its failure modes are discussed as a motivation
for future research on judicious dynamical expansions of the
sensorium which would allow the agent to overcome the
navigational obstructions formed by states in $M|_t$ having no
witness in the situation space X.

Finally, in III-C3 we explore the performance of some
excitation-driven DBAs: agents endowed with an excitation
level that changes depending on their distance from a target
in the environment E; the agents are capable of sensing
an increase or a decrease in excitation, and seek instant
gratification in the sense of operating on the mandate to
always pick an action guaranteeing an increase in excitation
(or else act randomly). We compare the performance of such
agents in the domains considered in II-E2 and II-F2. In these
domains it is easy to guarantee arrival at the target given
a priori knowledge of the correct snapshot structure, but we are
interested in the agents' performance as they learn the problem
"from scratch".


A. Defining Actions 

We will now restrict attention to DBAs with a sensorium
E endowed only with state (degree 0) and transition (degree
1) sensors. As before, we denote the realization of a sensor
$a \in E$ by $\rho(a)$, where $\rho(a) \subset X$ for a state sensor and
$\rho(a) \subset X \times X$ for a transition sensor. Thus, state sensors and
transition sensors may be viewed as Boolean and situational
fluents over the situation space X, which is sufficient for
setting up a discussion of actions and competencies according
to McCarthy and Hayes [47].

For our agents, we posit a set $E_{\mathrm{act}} \subset E$ of transition sensors,
each of which may be switched on and off at will, earning
them the name of actions. To be precise, our requirements
are:

• Actions are binary. We assume $E_{\mathrm{act}} \cap E_{\mathrm{act}}^* = \emptyset$, and
we denote the poc subset $E_{\mathrm{act}} \cup E_{\mathrm{act}}^* \cup \{0, 0^*\}$ by Act.

• Every action has outcomes. For any $\alpha \in E_{\mathrm{act}}$ and
$x \in X$, the sets

$$ \alpha(x) = \{ y \in X \mid x \times y \in \rho(\alpha) \} \tag{35} $$

are non-empty subsets of X.

In this we depart slightly from the accepted notion of actions
in the literature on transition systems of various flavors,
where actions are attached to states and the collection
of actions available at each state may differ, depending
on that state. Instead, we consider actions as nothing more
than control signals, sent by the agent's 'mind' to the agent's
'body' in order to invoke (or not) one or more of a fixed set of
available behaviors. It is the purpose of the 'mind' to identify
whether or not a control signal produces meaningful outcomes
as those outcomes are being sensed.

1) Invoking Actions Synchronously: Our sensor-centric
approach to actions reflects the viewpoint that (1) an action
$\alpha \in E_{\mathrm{act}}$ taken at a state $x \in X$ imposes a time-independent
restriction on the set of states the system may enter in the
following moment, and (2) the agent is capable, at least in
principle, of observing its own decisions as they are being
invoked. We must now discuss the precise extent to which
these principles may or may not restrict our initial suggestion
that the sensations in Act are controllable.

For example, consider the situation where the agent is not
engaging in an action $\alpha \in E_{\mathrm{act}}$ during a transition from state $x$
to state $y$. This implies $\alpha^*$ is on during this transition, which
restricts the possible values of $y$ to $X \setminus \alpha(x)$. Hence, not
invoking any of the available $\alpha \in E_{\mathrm{act}}$ must then restrict
$y$ to the intersection $\bigcap_{\alpha \in E_{\mathrm{act}}} (X \setminus \alpha(x))$, the set of possible
outcomes of the "no-action".

More generally, allowing a number of actions to be taken
at the same time (while not engaging in the rest) forces the
following interpretation by our sensing model:

1) A generalized action by the agent is a complete
$*$-selection $A$ on Act (recall definition A.6);

2) The realization of a (generalized) action $A \in S(\mathrm{Act})^0$
is defined to be

$$ \rho(A) = \bigcap_{\alpha \in A} \rho(\alpha), \tag{36} $$

or, equivalently, for every $x \in X$, the set of possible
outcomes of an action $A$ equals

$$ A(x) = \bigcap_{\alpha \in A} \alpha(x) \tag{37} $$

For this extended collection of actions one notices that the
second requirement of an action, $A(x) \ne \emptyset$ for all $x \in X$,
does not necessarily hold; for example, moving forward along
a rail contradicts any motion in the opposite direction. We
will say that a generalized action $A \in S(\mathrm{Act})^0$ is admissible
at $x \in X$ if $A(x) \ne \emptyset$, and that $A$ is admissible if it is
admissible at $x$ for all $x \in X$.
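The rail remark can be turned into a small computation of outcome sets for generalized actions. This is a sketch under assumptions that are not in the text: a rail with positions 0 to 3, two deterministic actions `fwd` and `bck` that clamp at the end points, a generalized action encoded as a dictionary mapping each action name to `True` (invoked) or `False` (its complement is on), and hypothetical helper names.

```python
def outcomes(action_maps, states, selection, x):
    """Outcome set A(x) of a generalized action at x: intersect alpha(x) over
    invoked actions and the complement of alpha(x) over suppressed ones."""
    result = set(states)
    for name, invoked in selection.items():
        reachable = action_maps[name](x)
        result &= reachable if invoked else set(states) - reachable
    return result


def is_admissible(action_maps, states, selection):
    """A generalized action is admissible if A(x) is non-empty at every x."""
    return all(outcomes(action_maps, states, selection, x) for x in states)


# Hypothetical rail with positions 0..3 and clamped unit steps.
LAST = 3
states = range(LAST + 1)
action_maps = {
    "fwd": lambda x: {min(LAST, x + 1)},
    "bck": lambda x: {max(0, x - 1)},
}

pure_forward = {"fwd": True, "bck": False}
both_at_once = {"fwd": True, "bck": True}

forward_ok = is_admissible(action_maps, states, pure_forward)
both_ok = is_admissible(action_maps, states, both_at_once)
```

Invoking `fwd` alone has a non-empty outcome set at every position, whereas invoking `fwd` and `bck` together has an empty one (already at position 0), so the combined action is not admissible.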

Aside from setting natural bounds on the meaning of the
initial statement that actions are available to the agent at will,
the notion of admissibility of a generalized action explains
how to interpret the poc set structure induced on Act from
the realization $\rho$: if

$$ \alpha < \beta \;\Longrightarrow\; \rho(\alpha) \subseteq \rho(\beta) \tag{38} $$

happens to hold for $P|_t$, then every generalized action
admissible at a point $x \in X$ defines a vertex of Cube(Act) no
matter the choice of $x \in X$. Similarly, generalized actions not
showing up as vertices of $\mathrm{Cube}(\mathrm{Act}|_t)$, where $\mathrm{Act}|_t$ denotes the
restriction of the poc set structure $P|_t$ to Act, represent the
agent's belief at time $t$ regarding combinations of elementary
actions it cannot achieve at that moment.

In the simple examples considered in this paper all agents
will be endowed with a collection of mutually exclusive atomic
actions. By this we mean that $\alpha < \beta^*$ holds for all $\alpha, \beta \in E_{\mathrm{act}}$
($\alpha \ne \beta$). Equivalently, only the "no-action", $\{\alpha^* \mid \alpha \in E_{\mathrm{act}}\}$,
and the "pure" actions $\{\alpha\} \cup \{\beta^* \mid \beta \in E_{\mathrm{act}} \setminus \{\alpha\}\}$ are
admissible, and the resulting cubing Cube(Act) takes the form of
a starfish: a tree with only one vertex of degree > 2, given
by the "no-action", and with a set of leaves in one-to-one
correspondence with the set of "pure" actions (see example
A-D1 and the accompanying figure).


2) Observations: The following set is the set of observations
in P (note that it is closed under the $*$-operator):

$$ \mathrm{Obs} := (E \setminus \mathrm{Act}) \cup \{0, 0^*\}, \tag{39} $$

and stands for the set of "passive" sensations, as opposed to
actions. Sections II-C through II-G explain how a trajectory
may be used to form an evolving sequence of weak poc-set
structures $(P|_t)_{t \ge 0}$ over E, with each $P|_t$ representing the
agent's belief at time $t$ regarding which implications among
the sensors in E hold true throughout time. Two poc subsets
of $P|_t$ are formed by restricting its poc structure:

• $\mathrm{Act}|_t$ is the induced poc structure on Act;

• $\mathrm{Obs}|_t$ is the induced poc structure on Obs.

We are interested in the interaction between these smaller
poc sets and the full model space, $\mathrm{Cube}(P|_t)$. One has two
surjections

$$ \mathrm{proj}_{\mathrm{act}} : \mathrm{Cube}(P|_t) \to \mathrm{Cube}(\mathrm{Act}|_t), \qquad \mathrm{proj}_{\mathrm{obs}} : \mathrm{Cube}(P|_t) \to \mathrm{Cube}(\mathrm{Obs}|_t) \tag{40} $$

defined, at the level of 0-skeleta, as follows: $\mathrm{proj}_{\mathrm{act}}$ sends a
coherent $*$-selection $A$ on $P|_t$ to the $*$-selection $A \cap \mathrm{Act}$, and
similarly for $\mathrm{proj}_{\mathrm{obs}}$. Hence, at the level of 0-skeleta, there is
a map:

$$ \mathrm{Cube}(P|_t) \to \mathrm{Cube}(\mathrm{Act}|_t) \times \mathrm{Cube}(\mathrm{Obs}|_t) \tag{41} $$

In fact, Sageev-Roller duality implies a much more
precise statement:

Proposition III.1. The map above is a median-preserving
embedding of $\mathrm{Cube}(P|_t)$ in the Cartesian product
$\mathrm{Cube}(\mathrm{Act}|_t) \times \mathrm{Cube}(\mathrm{Obs}|_t)$.

Proof: See proof in appendix B-D. ■

3) Example: discrete path with motion: To illustrate the
description of the model space provided by proposition III.1,
consider an agent moving in steps of unit length along a path
of integer length $L > 1$. Formally, the environment is given
by $E = \{0, \ldots, L\}$ and the agent has the actions defined by:

$$ \begin{aligned}
y \in \mathrm{wait}(x) &\iff \mathrm{pos}(y) = \mathrm{pos}(x) \\
y \in \mathrm{fwd}(x) &\iff \mathrm{pos}(y) = \min\{L, \mathrm{pos}(x) + 1\} \\
y \in \mathrm{bck}(x) &\iff \mathrm{pos}(y) = \max\{0, \mathrm{pos}(x) - 1\}
\end{aligned} \tag{42} $$

enabling motion from any vertex $k \in E$ to the adjacent $k + 1$
and $k - 1$, when they exist. We also endow the agent with
sensors $a_1, \ldots, a_L \in E$ realized as:

$$ (a_k : x) = 1 \iff \mathrm{pos}(x) < k \tag{43} $$

Up to symmetry, the only relations holding in the existing
scheme are

$$ a_1 < a_2 < \ldots < a_L, \tag{44} $$

the 'starfish' relations for Act:

$$ \mathrm{fwd} < \mathrm{bck}^*, \quad \mathrm{bck} < \mathrm{wait}^*, \quad \mathrm{wait} < \mathrm{fwd}^*, \tag{45} $$

for the actions {fwd, bck, wait}, and the two relations

$$ \mathrm{fwd} < a_1^*, \quad \mathrm{bck} < a_L, \tag{46} $$

indicating $\mathrm{pos}(x) = 0$ may not be reached by applying fwd,
while $\mathrm{pos}(x) = L$ may not be reached by applying bck.
No other relations hold universally. Let P denote the poc set
structure over E recording these relations.

Cube(P) is the result of forming the Cartesian product of
a 3-pronged starfish Cube(Act) with the path of length $L$
obtained as Cube(Obs), and then removing two squares as
shown in figure 5 due to the relations in (46).

Fig. 5. Model space for a DBA placed in a discrete path, depicted
together with its projections to Cube(Act) (right) and to Cube(Obs)
(below). This is the case L = 5 of example III-A3.
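The relations claimed for the discrete path example can be checked by brute force for a small path. This sketch makes two assumptions that go beyond the text: states are identified with their positions, and a state sensor evaluated on a transition is read at the end point of that transition. The starfish relations among the actions themselves are not tested, since they express the convention that at most one action is invoked at a time.

```python
# Path of hypothetical length L = 5 with positions 0..L.
L = 5
positions = range(L + 1)

# Deterministic outcome of each action, clamped at the end points.
ACTIONS = {
    "wait": lambda x: x,
    "fwd": lambda x: min(L, x + 1),
    "bck": lambda x: max(0, x - 1),
}


def a(k, x):
    """State sensor a_k: on exactly when pos(x) < k."""
    return x < k


# Nesting a_1 < a_2 < ... < a_L: whenever a_k is on, so is a_{k+1}.
nested = all(a(k + 1, x) for k in range(1, L) for x in positions if a(k, x))

# fwd < a_1^*: after a forward step the agent is never at position 0.
fwd_avoids_start = all(not a(1, ACTIONS["fwd"](x)) for x in positions)

# bck < a_L: after a backward step the agent is never at position L.
bck_avoids_end = all(a(L, ACTIONS["bck"](x)) for x in positions)

# A relation that does not hold: fwd does not imply a_L (fwd can reach L).
fwd_implies_a_L = all(a(L, ACTIONS["fwd"](x)) for x in positions)
```

The last check fails by design and illustrates why only two relations link the actions to the position sensors.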


B. Reactive Planning 

1) Statement of the planning problem: In this section we
consider a DBA at time $t > 0$, equipped with a snapshot $\mathcal{S}|_t$
with a derived poc graph $\Gamma|_t = \mathrm{Dir}(\mathcal{S}|_t)$ and associated weak
poc set $P|_t$ (but keep in mind the notational simplifications at
the end of II-G). The agent's tasks at hand are:

(T1) Predict the immediate outcome of any $\alpha \in \mathrm{Act}|_t$
(or, more generally, of any $A \in \mathrm{Cube}(\mathrm{Act}|_t)$);

(T2) Given a set $T \subset E$ of target sensations to be achieved
jointly, decide on a (generalized) action for the
agent to invoke in the next transition.

Footnote: Compare with example A-D1 and the associated figure.

Both tasks need to be achieved based on the agent's record
of the current state, $S|_t$ (the state of the snapshot $\mathcal{S}|_t$), which is
a coherent (though not necessarily complete) $*$-selection on $P|_t$
obtained from the complete $*$-selection $O|_t$ representing the agent's
raw observation of the current state at time $t$, by coherent
projection (104). We will keep all the above notation fixed
through the rest of this section.

It is crucial that we interpret these tasks in terms of the
model space $M|_t = \mathrm{Cube}(P|_t)$. For any subset $B \subset E$, define
the set:

$$ \mathfrak{h}(B) := \{ V \subset E \mid V \text{ is a vertex of } M|_t \text{ containing } B \} \tag{47} $$

These are known to be precisely the convex subsets of the
1-skeleton of $M|_t$ (see appendix A-D). Thus, the agent is
assigned the problem of reaching the convex set $\mathfrak{h}(T)$ from a
(possibly unknown) position in the convex set $\mathfrak{h}(S|_t)$.


2) Signal Propagation over a Snapshot: The purpose of 
r ^ is to serve as an inference tool for the agent. Recall that 
r j is formed from the weight structure of the snapshot S|^, 
which, in turn, is a result of updating the weights on 
with the raw observation 0\^. The last step of the update is 
the ‘loading’ of r|^ with the current state >S'|^ = coh(0| J of 
the agent. 

Definition III.2. Let $B \subseteq \Sigma$. Denote by $[\Gamma|_t, B]$ the weighted
graph obtained from $\Gamma|_t$ by attaching the Boolean weight
$(1_B : v)$ to each vertex $v \in \Sigma$, and refer to it as $\Gamma|_t$ being
loaded with $B$. □


Definition III.3. A propagation algorithm along $\Gamma|_t$ is any
algorithm which, for any coherent load $B \subseteq \Sigma$ and any $T \subseteq \Sigma$,
accepts $[\Gamma|_t, B]$ and $T$ as input and produces as its output the
loaded graph $[\Gamma|_t, R]$ where $a \in R$ if and only if:

1) there is a directed path in $\Gamma|_t$ from $B \cup T$ to $a$, and

2) there is no directed path in $\Gamma|_t$ from $a$ into $T^*$.

The set $R \subseteq \Sigma$ is said to be the result of propagating the
signal $T$ along $[\Gamma|_t, B]$. □


Applying the convexity theory of the model space $M|_t$
(specifically corollary B.16 and proposition B.4), we find the
following applications for propagation:


Lemma III.4 (Implementing the State Update). For any propagation
algorithm, propagating the signal $O|_t$ along $[\Gamma|_t, \emptyset]$
produces $S|_t = \mathrm{coh}(O|_t)$. □

Lemma III.5 (Reasoning in Snapshots). Let $T \subseteq \Sigma$ be any
set. For any propagation algorithm, propagating the signal $T$
along $[\Gamma|_t, S|_t]$ produces the projection in $M|_t$ of the current
state $\mathfrak{h}(S|_t)$ onto the reduced target $\mathfrak{h}(\mathrm{coh}(T)) \subseteq M|_t$.

The first lemma explains how to implement the snapshot 
update, given a propagation algorithm: 


Algorithm 1 A simple implementation of propagation of a
signal $T$ over a poc graph $\Gamma$ loaded with $B$, based on depth-first
search.

function main($\Gamma$, $B$, $T$)              > Propagating $T$ over $[\Gamma, B]$
    visited $\leftarrow \emptyset$                  > A global variable
    $U \leftarrow$ closure($\Gamma$, $T$)
    return $(B \cup U) \setminus U^*$
end function

function closure($\Gamma$, $T$)                > Forward closure of $T$ in $\Gamma$
    for all $a \in T$ do
        explore($\Gamma$, $a$)
    end for
    return visited
end function

procedure explore($\Gamma$, $v$)               > Recursive step
    visited $\leftarrow$ visited $\cup \{v\}$
    for all $w \in$ children($\Gamma$, $v$) $\setminus$ visited do
        explore($\Gamma$, $w$)
    end for
end procedure

function children($\Gamma$, $v$)               > Children of $v$ in $\Gamma$
    return $\{ w \in \Sigma \mid vw \in \Gamma \}$
end function


1) use the raw observation $O|_t$ and the previous snapshot to recalculate the
edge weights for $\mathcal{S}|_t$;

2) compute the derived graph $\Gamma|_t$;

3) propagate the signal $O|_t$ over $[\Gamma|_t, \emptyset]$ to compute $S|_t = \mathrm{coh}(O|_t)$.

The second lemma is the key tool for turning a propagation 
algorithm into the planning algorithms we discuss in the rest 
of this section. 

In practice, one can implement propagation using a variant
of depth-first search (DFS) on $\Gamma|_t$ [63], while maintaining an
expanding record of vertices visited (see algorithm 1). DFS runs in
time linear in the number of vertices plus the number of edges, and
$\Gamma|_t$ has $|\Sigma|$ vertices and at most $|\Sigma|^2$ directed edges, so this
algorithm has time complexity that is at most quadratic
in the number of sensors, and we conclude:

Corollary III.6 (Quadratic Snapshot Maintenance). Both the
time and space complexity of updating the snapshot with
an observation $O|_t$ to form $\mathcal{S}|_t$ are at most quadratic in $|\Sigma|$.
□
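
As an illustration of algorithm 1, the following is a minimal Python sketch of propagation by depth-first search. The encoding is our own assumption and not part of the architecture: a sensor is a pair (name, sign), with sign False standing for the complement $a^*$, and the graph stores every implication together with its contrapositive. The helper names (build_graph, forward_closure, propagate) and the toy chain $a_1 < a_2 < a_3$ are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Sensor = Tuple[str, bool]           # (name, sign); sign False encodes the complement a*
Graph = Dict[Sensor, Set[Sensor]]   # adjacency sets: an edge a -> b records "a implies b"


def star(s: Sensor) -> Sensor:
    """Complement of a sensor: a <-> a*."""
    return (s[0], not s[1])


def build_graph(implications: Iterable[Tuple[Sensor, Sensor]]) -> Graph:
    """Store each implication a < b together with its contrapositive b* < a*."""
    graph: Graph = {}
    for a, b in implications:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(star(b), set()).add(star(a))
    return graph


def forward_closure(graph: Graph, sources: Iterable[Sensor]) -> Set[Sensor]:
    """All vertices reachable from `sources`, found by an iterative DFS."""
    visited: Set[Sensor] = set()
    stack = list(sources)
    while stack:
        v = stack.pop()
        if v not in visited:
            visited.add(v)
            stack.extend(graph.get(v, ()))
    return visited


def propagate(graph: Graph, load: Set[Sensor], signal: Set[Sensor]) -> Set[Sensor]:
    """Return (load | U) minus U*, where U is the forward closure of `signal`."""
    u = forward_closure(graph, signal)
    return (set(load) | u) - {star(v) for v in u}


# Toy poc graph: a1 < a2 < a3, with a_k read as "position < k" (an assumption).
a1, a2, a3 = ("a1", True), ("a2", True), ("a3", True)
graph = build_graph([(a1, a2), (a2, a3)])
state = {star(a1), a2, a3}                        # agent at position 1
projected = propagate(graph, state, {star(a2)})   # target: position >= 2
```

In this toy case the propagated load corresponds to position 2, the point of the target region closest to the current state, as lemma III.5 would suggest. Each vertex is pushed at most once per incoming edge, which is consistent with the quadratic bound of corollary III.6.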


A far more efficient implementation is possible provided
sufficient parallel processing power, by realizing propagation
directly on $\Gamma|_t$ in a distributed fashion, using corollary B.16:
given $[\Gamma|_t, S|_t]$ and a target $T$ one first follows all directed
paths in $\Gamma|_t$ emanating from $S|_t \cup T$, loading the traversed
vertices with 1, and then follows all reverse paths emanating
from $T^*$ and loads their vertices with zeros. Implementing this
algorithm in practice is problematic for large $|\Sigma|$ in view of the
high plasticity of the graph $\Gamma|_t$ and the potentially prohibitive
requirement for the DBA to maintain up to $O(|\Sigma|)$ processes,
all active at the same time. Despite its current impracticality,
such an implementation seems evocative of the notion of
neuronal networks. We discuss this tentative connection in
section IV-B.
3) Algorithm: Computing the consequences of an action:
Planning of any kind requires an ability to sense the context 
of an action. This ability may be imparted to the agent by 
introducing sensors of the form 

{a As : (p)\^ = {a : (48) 

where $a$ is an action and $s \in \Sigma$ is any sensor. The idea behind
constructing $a \wedge s$ in this way is for snapshots to be able to
detect implications of the form "invoking $a$ when $s$ is on leads
to $s'$" simply as directed paths from $a \wedge s$ to $s'$ in the derived
graph. From a formal point of view, allowing this kind of
sensor requires the observability of the snapshot in $X$, that
is: in order for the values of sensors in $\Sigma$ to be allowed as
input to (possibly other) sensors in $\Sigma$, it is necessary by our
formalism for $X$ to carry information regarding these values
in its state.

The problem of constructing a judicious process of enrich¬ 
ing the sensorium with an effective collection of introspective 
sensors is set aside for future research. Instead, in the present 
model and simulations used to illustrate these ideas for the 
purposes of this paper we have committed to a sensorium 
containing an over-abundance of such sensors: 

• Position Sensors. We assume the environment E is
given as the union of a collection $\mathcal{W}$ of its subsets, the
agent being given a state sensor $\mathrm{loc}[U]$ for each $U \in \mathcal{W}$
satisfying $(\mathrm{loc}[U] : x) = (1_U : \mathrm{pos}(x))$.

• Actions. A collection of actions (in the form of 1- 
sensors) is provided. 

• Contextualized actions. For each $U \in \mathcal{W}$ and $a \in \mathrm{Act}$
the agent is given the sensors $a \wedge \mathrm{loc}[U]$ and $a \wedge \mathrm{loc}[U]^*$.

Under these assumptions, the following result yields a mechanism
allowing the agent to 'hallucinate' the broadest consequences
of an action within the context of its current model
space $M|_t$:

Corollary III.7 (Computing the Consequences of an Action).
For any generalized action $A \subseteq \mathrm{Act}$, the result of applying $A$
in the transition from time $t > 0$ to time $t + 1$ is the result of
propagating the collection $\{a \wedge \mathrm{loc}[U]\}_{a \in A,\ \mathrm{loc}[U] \in S|_t}$ over
$[\Gamma|_t, S|_t]$. □

Thus, propagation provides a provably correct and compu¬ 
tationally efficient mechanism for predicting the immediate 
outcomes of an action, provided a sensorium of the above 
form and a snapshot faithfully recording the nesting relations 
among the sensors. 

Combined with the results of section demonstrating that 
learning the correct relations within a fixed sensorium with a 
high degree of fidelity is possible even for an agent performing 
a random walk, the last corollary suggests that effective 
(and efficient) planning and closed loop control — bundled 
together with life-long learning features — are entirely feasible 
for DBAs carrying a snapshot architecture. We discuss both 
problems in the following paragraphs. 


4) Example: discrete path with motion, revisited: To illustrate
the above, we continue example III-A3. Recalling
$E = \{0, \ldots, L\}$ we see that the sensors $a_k$ defined in (43)
may be rewritten as:

$$a_k = \mathrm{loc}[U_k], \qquad U_k = \{ i \in E \mid 0 \leq i < k \} \qquad (49)$$

Fig. 6. Model space for an agent on a discrete path, with two added
contextual action sensors.

Thus, for example, adjoining the two sensors $\mathrm{fwd} \wedge a_2^*$ and
$\mathrm{bck} \wedge a_4$ to $\Sigma$ implies the relations

$$\mathrm{fwd} \wedge a_2^* < a_3^*, \qquad \mathrm{bck} \wedge a_4 < a_3, \qquad (50)$$

whose effect on $\mathrm{Cube}(P)$, once they are learned by the agent,
is shown in figure 6.

Further expanding $\Sigma$ to include all the sensors

$$\mathrm{fwd} \wedge a_k^*, \quad k = 1, \ldots, L-1$$
$$\mathrm{bck} \wedge a_k, \quad k = 2, \ldots, L$$

turns $\mathrm{Cube}(P)$ into the complex illustrated in figure 7. As
shown in the figure, the order structure on $P$ encodes both
large-scale geometry (the agent may use propagation to conclude
"in order to reach $\mathfrak{h}(a_5^*)$, I need to reach $\mathfrak{h}(a_2^*)$"), and
the actions required to negotiate this geometry ("I know that
$\mathrm{fwd} \wedge a_1^*$ implies $a_2^*$, and I am currently in $\mathfrak{h}(a_1^*)$").
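
The prediction mechanism of this example can be sketched in a few lines of Python. The sketch rests on assumptions of ours: the sensor $a_k$ is taken to be on exactly when the position is smaller than $k$, the contextual sensors satisfy $\mathrm{fwd} \wedge a_k^* < a_{k+1}^*$, sensors are encoded as (name, sign) pairs, and all helper names are hypothetical. The predicted outcome of fwd is obtained by propagating the contextual sensors whose context is currently on.

```python
L = 5  # path positions are 0..L (illustrative size)


def a(k, on=True):
    """Position sensor a_k (assumed on iff position < k); on=False gives a_k*."""
    return (f"a{k}", on)


def fwd(k):
    """Contextual sensor 'fwd invoked while a_k* holds' (hypothetical naming)."""
    return (f"fwd&a{k}*", True)


def star(s):
    return (s[0], not s[1])


def build_graph(implications):
    graph = {}
    for x, y in implications:
        graph.setdefault(x, set()).add(y)              # x < y
        graph.setdefault(star(y), set()).add(star(x))  # contrapositive y* < x*
    return graph


def propagate(graph, load, signal):
    seen, stack = set(), list(signal)
    while stack:                                       # forward closure of the signal
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(graph.get(v, ()))
    return (set(load) | seen) - {star(v) for v in seen}


GRAPH = build_graph(
    [(a(k), a(k + 1)) for k in range(1, L)]                 # a_k < a_{k+1}
    + [(fwd(k), a(k + 1, False)) for k in range(1, L)]      # fwd ^ a_k* < a_{k+1}*
)


def state_at(p):
    """Record of position p over the position sensors a_1..a_L."""
    return {a(k, p < k) for k in range(1, L + 1)}


def predict_fwd(p):
    """Predicted position after fwd, read off from the propagated sensations."""
    state = state_at(p)
    signal = {fwd(k) for k in range(1, L) if a(k, False) in state}
    result = propagate(GRAPH, state, signal)
    return sum(1 for k in range(1, L + 1) if a(k, False) in result)
```

Under these assumptions, predict_fwd(1) evaluates to 2 and predict_fwd(L) to L (the agent stays put at the end of the path). With only the sensors listed above, no contextual sensor is active at position 0, so nothing is predicted there.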



Fig. 7. Model space for an agent on a discrete path, enriched with a
full complement of contextualized action sensors and illustrating
the geometry underlying planning by propagation (III-B5). Figure
annotations: intermediate sensations $a_5^* < a_4^* < a_3^* < a_2^*$;
current state: position 1; goal: "arrive at position 5".
5) Algorithm: Greedy Reactive Planning (GRP) of Motion
Towards a Specified Target: The ability to compute the immediate
consequences of any available action and the convexity
theory of $M|_t$ underlie the following greedy algorithm used
to decide on an action to be taken for the purpose of achieving
a long-term goal:

1) Given a set $T$ of target sensations, propagating $T$ over
$[\Gamma|_t, S|_t]$ yields a list $R$ of sensations characterizing the
projection of the region $\mathfrak{h}(S|_t)$ representing the current state
in $M|_t$ to the desired region $\mathfrak{h}(T)$.

2) Each of the elements of $R$ may be considered as a subgoal,
and a generalized action guaranteed to achieve as
many of these subgoals as possible may be selected
based on corollary III.7; any ties are broken arbitrarily.

3) Once an action is invoked, the same target $T$ is presented
to the agent for an additional iteration of this procedure,
until completion.
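
The three steps above can be sketched in Python on the discrete path of the preceding example. This is an illustration under our own assumptions, not the implementation used in the simulations: $a_k$ is on iff the position is smaller than $k$, the relations $\mathrm{fwd} \wedge a_k^* < a_{k+1}^*$ and $\mathrm{bck} \wedge a_k < a_{k-1}$ are assumed to be already learned, the agent's record is obtained by propagating its position sensors over the empty load (lemma III.4), and every name below is hypothetical.

```python
L = 5  # positions 0..L on the discrete path (illustrative size)


def a(k, on=True):
    return (f"a{k}", on)            # a_k assumed on iff position < k; on=False is a_k*


def star(s):
    return (s[0], not s[1])


# (context sensor, contextual action sensor, consequence), an assumed rule table.
RULES = {
    "fwd": [(a(k, False), (f"fwd&a{k}*", True), a(k + 1, False)) for k in range(1, L)],
    "bck": [(a(k), (f"bck&a{k}", True), a(k - 1)) for k in range(2, L + 1)],
}


def build_graph(implications):
    graph = {}
    for x, y in implications:
        graph.setdefault(x, set()).add(y)
        graph.setdefault(star(y), set()).add(star(x))   # contrapositive
    return graph


def propagate(graph, load, signal):
    seen, stack = set(), list(signal)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(graph.get(v, ()))
    return (set(load) | seen) - {star(v) for v in seen}


GRAPH = build_graph(
    [(a(k), a(k + 1)) for k in range(1, L)]
    + [(ctx, out) for rules in RULES.values() for (_, ctx, out) in rules]
)


def state_at(p):
    """Coherent record of position p: propagate the raw reading over the empty load."""
    return propagate(GRAPH, set(), {a(k, p < k) for k in range(1, L + 1)})


def predict(action, state):
    """Predicted sensations after `action`, in the manner of corollary III.7."""
    signal = {ctx for (cond, ctx, _) in RULES[action] if cond in state}
    return propagate(GRAPH, state, signal)


def grp_step(state, target):
    subgoals = propagate(GRAPH, state, target) - state            # step 1
    scores = {act: len(subgoals & predict(act, state)) for act in RULES}
    best = max(scores, key=scores.get)                            # step 2, first maximum wins ties
    return best if scores[best] > 0 else None


def run(p, target, max_steps=20):
    path = [p]
    for _ in range(max_steps):                                    # step 3
        action = grp_step(state_at(p), target)
        if action is None:
            break
        p = min(p + 1, L) if action == "fwd" else max(p - 1, 0)   # toy environment
        path.append(p)
    return path
```

For instance, run(1, {a(5, False)}) walks forward to position 5 and stops there, while run(4, {a(2)}) walks back until the sensor $a_2$ turns on. Taking the first maximum is one arbitrary way of breaking ties.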


By lemma III.5 this algorithm is directly analogous to motion
planning in the Euclidean plane in the absence of obstacles:
the agent selects an action which, to the best of its knowledge,
best approximates the greedy path towards the closest point of
the indicated target. The next section will consider the kinds
of problems arising in the presence of obstacles in the model
space and some early numerical study undertaken to explore
overcoming some of these problems.

C. Obstructions to Greedy Reactive Planning.

Where do obstructions to GRP in $M|_t$ come from? Recall
that every transition $x \times y \in X \times X$ capable of occurring in
the given experiment determines a complete $*$-selection $A$ on
$\Sigma$, by our observation model, via:

$$a \in A \iff \begin{cases} y \in \rho(a) & \text{if } a \text{ is a state sensor} \\ x \times y \in \rho(a) & \text{if } a \text{ is a transition sensor} \end{cases} \qquad (52)$$

Thus, although $M|_t$ does provide a universal model space for
any realization of the weak poc set structure $P|_t$, the agent
is only capable of witnessing $*$-selections of the above form,
no matter the choice of action. This observation motivates the
following definition:

Definition III.8 (Punctured Model Space). By the punctured
model space at time $t$ we mean the sub-complex of
$M|_t$ induced by the set of vertices of $M|_t$ of the form (52)
(compare with the discussion in appendix A-E3).

Recall that a sub-complex $L$ of a cell complex $K$ is induced by a set $V$ of
vertices of $K$ if $L$ contains every cell of $K$ all of whose vertices belong
to $V$.

Thus, in addition to the possibility that an agent will have a
false implication on record (causing some sensory equivalence
classes to be deemed incoherent until they are sufficiently
sampled), it is also possible that $M|_t$ contains obstacles to
GRP in the form of vertices of $M|_t$ lying outside the punctured
model space. In fact, the
presence of obstacles of this kind is guaranteed by the main
result of [30] (also reviewed in appendix A-E4, Theorem
A.33), at least in cases when E does not have the homotopy
type of a point and the covering $\mathcal{W}$ of E by location place
fields satisfies the richness requirements placed on it by that
theorem. We consider two examples of this kind.

1) Example: A Punctured Grid: Consider the example of
an agent navigating an $N \times N$ square grid (realized as a
subset $G_N$ of the integer grid) and equipped with a collection
of position sensors identical to that of II-E2(d) and II-E2(d).
Denoting $\mathrm{pos}(x) := (\xi, \eta) \in \mathbb{Z} \times \mathbb{Z}$ we have the sensors

$$(x_i : x) = 1 \iff \xi < i, \qquad (y_i : x) = 1 \iff \eta < i, \qquad (53)$$

for $i \in \{1, \ldots, N-1\}$. This time, however, suppose that
one interior vertex $v_0$ of the grid has been removed, so that
$E = G_N \setminus \{v_0\}$. As in the above simulations, we assume
the agent is equipped with actions labelled up, down, left and
right whose effect at each vertex is to move to the appropriate
adjacent vertex of the integer grid if that vertex belongs to E,
and to remain in place otherwise. Suppose, for simplicity, that
the snapshot structure for this agent is empirical.

For $N$ sufficiently large, the statistical nature of the learning
algorithm will cause the agent to learn the same weak poc set
structure as in the case of $v_0$ being present: implications of
the form

$$\mathrm{up} \wedge y_i^* < y_{i+1}^*, \qquad \mathrm{up} \wedge x_i < x_i, \qquad x_i < x_{i+1} \qquad (54)$$

and their respective variations will be learned upon sufficient
exposure, giving rise to the same poc set structure as the one
representing the complete grid $G_N$. One could view this as
a manifestation of the fact that our model spaces are always
contractible (corollary A.31).

As a consequence, the agent is bound to attempt moving
to the unavailable vertex $v_0$ at any time $t$ when its position,
$p^t$, is adjacent to $v_0$ and $v_0$ belongs to a shortest path in $G_N$
joining $p^t$ with the prescribed target. In such a situation, the
agent is guaranteed to attempt motion in the direction of $v_0$
and fail (remain in place). Moreover, after sufficiently many
such attempts the agent is bound to unlearn the implication
responsible for this particular choice of action; this will have
an overall negative impact on planning.

2) Example: Agent on a Circular Rail: A subtly different
example is that of II-E2(b) and II-P2(b). Here, motion along
a circular rail is modeled by setting E to be the set of
integers modulo $N$, with two available actions fwd and bck
corresponding to the operations of adding and subtracting a
unit, respectively (all arithmetic relating to the environment in
this example is done modulo $N$). Position sensors have the
form $\mathrm{loc}[U_i]$ where $U_i = \{i-1, i, i+1\}$.

For simplicity consider a situation with $N$ big and even,
and assume the agent has complete knowledge of the correct
poc set structure, which is the one generated by the relations
(appendix A-A2):

$$\mathrm{loc}[U_i] < \mathrm{loc}[U_j]^* \iff \mathrm{dist}(i, j) > 2, \qquad (55)$$

where $\mathrm{dist}(i,j)$ denotes the distance (modulo $N$) between
the positions $i$ and $j$, as well as:

$$\mathrm{fwd} \wedge \mathrm{loc}[U_i] < \mathrm{loc}[U_{i+1}], \qquad \mathrm{bck} \wedge \mathrm{loc}[U_i] < \mathrm{loc}[U_{i-1}] \qquad (56)$$




























for all $i \in \{0, \ldots, N-1\}$. Without loss of generality, the
current state $x$ of the system satisfies $\mathrm{pos}(x) = 0$.

Let the specified target be $T = \{U_p\}$, where $p \in \{0, \ldots, N-1\}$
is sufficiently far from the origin (the current
position of the agent) to accommodate a pair $U_i$, $U_j$ such that:

1) $U_i \cup U_j$ does not intersect $U_p \cup U_0$;

2) $U_i \cup U_j$ separates $p$ from $0$ (on the circle).

Thus both the current state and the target region satisfy
the constraints $\mathrm{loc}[U_i]^*$ and $\mathrm{loc}[U_j]^*$, which implies
that any geodesic in the model space joining the current
model state with the model target set passes through
$\mathfrak{h}(\mathrm{loc}[U_i]^*, \mathrm{loc}[U_j]^*)$, yet it is clearly impossible to guarantee
these constraints by any of the available actions.
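
The combinatorics of this example are easy to check by brute force. The Python sketch below (our own illustration, with hypothetical function names) verifies that two place fields $U_i$, $U_j$ on the rail are disjoint exactly when $\mathrm{dist}(i,j) > 2$, which is the content of relation (55) read as "being in $U_i$ excludes being in $U_j$", and enumerates the pairs satisfying conditions 1) and 2) for a given target $p$.

```python
def U(i, n):
    """Place field U_i = {i-1, i, i+1} on the rail Z/n."""
    return {(i - 1) % n, i % n, (i + 1) % n}


def dist(i, j, n):
    """Distance between positions i and j on the rail Z/n."""
    return min((i - j) % n, (j - i) % n)


def relation_55_holds(n):
    """Check: U_i and U_j are disjoint exactly when dist(i, j) > 2."""
    return all(
        U(i, n).isdisjoint(U(j, n)) == (dist(i, j, n) > 2)
        for i in range(n) for j in range(n)
    )


def separating_pairs(n, p):
    """Pairs (i, j) whose fields avoid U_0 and U_p yet block both arcs from 0 to p."""
    forward = set(range(0, p + 1))                 # arc 0, 1, ..., p
    backward = {k % n for k in range(p, n + 1)}    # arc p, p+1, ..., n-1, 0
    pairs = []
    for i in range(n):
        for j in range(i + 1, n):
            blocked = U(i, n) | U(j, n)
            if blocked & (U(0, n) | U(p, n)):      # condition 1) fails
                continue
            if blocked & forward and blocked & backward:   # condition 2)
                pairs.append((i, j))
    return pairs
```

With $N = 12$ and $p = 6$ the pair $(3, 9)$ is among those returned, whereas for $N = 6$ and $p = 3$ no such pair exists, which illustrates why $p$ must be sufficiently far from the origin.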

It is, nevertheless, possible to extend this sensorium in
a way that enables the effective learning of a target, as the 
numerical studies below demonstrate, by introducing into the 
environment a graded signal whose strength encodes a measure 
of distance to the target, while endowing the agent with sensors 
responding to the gradient of this signal. 

3) Closing the Loop: Excitation-Driven Navigation: From 
the preceding examples it is clear that additional sensing — 
most probably involving information regarding transitions — 
is absolutely necessary for overcoming the geometric and 
topological obstructions to the GRP algorithm: while the 
GRP algorithm may be considered as providing a reasonable 
reference dynamics for reactive planning, one must consider 
possible means for replanning in the face of failure. We 
conjecture that the notion of a snapshot is sufficiently simple 
and agile for such purposes: 

- Control of learning thresholds. At this stage of our
research, no attempt is being made to control the learning
thresholds $\tau_{ab}$; it seems plausible, however, that having a
high level of "frustration" cause the lowering of a relevant
threshold may be used as a tool for locating exceptions
to poc relations.

- Introduction of new sensors. A principled mechanism is
required for the introduction of combinations of existing
sensors, such as Boolean functions thereof, or delayed
conjunctions such as (48), to become additional members
of the sensorium. In particular, such a mechanism must be
capable of responding to exceptions, or failures of GRP,
as we have already discussed above.

The need for self-adjustment in the sensorium opens the 
door to the introduction of auxiliary internal mechanisms of 
evaluating the position of the agent in the environment, or, 
more generally, the state of the system consisting of the pairing 
of the agent with the environment. For example, the settings 
described in the preceding paragraph suggests the introduction 
of an internal variable evaluating success (and failure) of 
invoking a planned action, while the need for closed-loop 
control suggests simple local control mechanisms based on an 
internally-defined ‘navigation function’ M- Many other ideas 
of internal behavior modulation ranging from varying notions 
of novelty, surprise and dependability ifSSll . Il64l . Il2^ . 1^ all 
the way to a multivariate model of human neuro-modulation 
mechanisms 1591 become relevant in this context. 





Fig. 8. Mean deviation from target for empirical (blue) and
discounted (red) agents (20 sensors each), as a function of time in four
different settings, averaging over 50 runs. Panel (d): Interval w/Random Sensors.


Excitation-Driven Agents. In the absence of tools for reactive 
replanning (our current situation), we have chosen to study a 
simplified notion of target, allowing us to close the control 
loop with a ’’motion” command generated with the aim to 
guarantee an immediate decrease in the value of an internal 
excitation signal. 

The simplest instance of such a controller, applied to the
navigation setting, seems to be the following. In addition to
a sensorium of the form described above in III-B3 and the
examples that followed, assume that the agent possesses a pair
of sensors better and worse, responding to the decrease and
increase, respectively, in a fixed measure of distance to a target
point in the environment E, over a single transition (think of
this as a radically simplified sense of smell).

Starting out as a lazy random-walking agent, the agent
uses the algorithm of III-B5 at each step to obtain an action
resulting in better (target specification $T = \{\mathrm{better}\}$)
as its first priority. In the case of failure to produce such an
action, the agent attempts to guarantee $\mathrm{worse}^*$, periodically
attempting a random action so as not to get stuck in place
(upon having figured out that $\mathrm{wait} < \mathrm{worse}^*$). Figure 8
presents a comparison between the mean behaviors of four
different such agents simulated in the same settings as those
analyzed in section |I^ 
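
The priority scheme just described can be sketched as follows. This is only an illustration of the control logic: the snapshot-based prediction is replaced by a hypothetical oracle (make_oracle) that returns exact one-step predictions on a path, and the sensor encoding, the exploration period explore_every and all function names are our own assumptions.

```python
import random

BETTER, NOT_WORSE = ("better", True), ("worse", False)   # the sensations better and worse*


def choose_action(actions, predict, step, explore_every=10, rng=random):
    for act in actions:                       # first priority: guarantee better
        if BETTER in predict(act):
            return act
    if step % explore_every == 0:             # periodic random action, so as not to get stuck
        return rng.choice(actions)
    for act in actions:                       # fallback: guarantee worse*
        if NOT_WORSE in predict(act):
            return act
    return rng.choice(actions)


def make_oracle(position, target, length):
    """Stand-in for a trained snapshot: exact one-step predictions on the path 0..length."""
    def move(p, act):
        return {"fwd": min(p + 1, length), "bck": max(p - 1, 0), "wait": p}[act]

    def predict(act):
        before = abs(position - target)
        after = abs(move(position, act) - target)
        sensations = set()
        if after < before:
            sensations.add(BETTER)
        if after <= before:
            sensations.add(NOT_WORSE)
        return sensations

    return move, predict


def simulate(start, target, length=10, steps=7):
    p = start
    for step in range(1, steps + 1):
        move, predict = make_oracle(p, target, length)
        p = move(p, choose_action(["fwd", "bck", "wait"], predict, step))
    return p
```

With exact predictions the agent closes the distance by one unit per step, so simulate(0, 7) ends at position 7. In the architecture itself the predictions would come from propagation over the learned snapshot and may be wrong early on, which is where the periodic random action matters.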

It is important to stress that, following the earlier discussion
and the formal results of the preceding paragraphs,
the guarantee of the agents in figure 8(a)-(c) finding their
targets given sufficient knowledge of the correct poc structure
on $\Sigma$ is absolute. To see this, it suffices to verify for the true
poc set structure on $\Sigma$ that any position other than the target
has associated with it a location sensor $s = \mathrm{loc}[U]$ and an
action $a$ such that every state $x$ with $(s : x) = 1$ satisfies the
requirement that $a(x)$ is closer to the target than $x$ is.

An action wait is available now, to let the agent stay put when it has
found the target.

The only remaining question is whether or not the control
strategy we propose guarantees sufficient exposure for the
agent to recover the poc relations necessary for it to capture the
target. Figure 8 offers some evidence in favor of an affirmative
answer, based on simulations.

IV. Discussion 
A. Topological Mapping and Planning 

Mapping methods vary from probabilistic representations 
iini, im, ia to hybrid representations in which precisely 
mapped contractible local metric patches are integrated into 
a global map through ‘gluing instructions’ recorded in an 
annotated graph that is learned by the agent in the course 
of its travels (Iia, 1361, 1371, IMl)- The latter are known 
as topological SLAM (simultaneous localization and mapping) 
methods. 

In the presence of significant obstacles in large scale environments,
the lightweight 'topological skeleton' of the environment
recorded by a topological SLAM method provides
valuable information on loop closure, which would otherwise
be costly to obtain using global metric mapping techniques.
This observation led to an extensive effort [66], [34] to produce
a general notion of a topological map, based on the notion
of spatial semantic hierarchy (SSH), that would allow for
motion planning at varying scales. Other ways of leveraging
planar topological descriptors to obtain simple and efficient 
data structures supporting localization and motion planning in 
simply connected planar domains have been explored as well, 
e.g. ED, ESI and M- 

In view of our stated need for guaranteed universality prop¬ 
erties, the strong dependence of SLAM methods on Euclidean 
geometry made it necessary for us to adopt a significantly more 
abstract approach to localization arising from the point of view 
of sensor fields. A precursor of our approach is a family of 
SLAM algorithms, known as RatSLAM, employing a neural 
network to simulate the function of place/pose cells in rats, 
e.g. 03, El, Ei. In a broad generalization of the neural 
computational engine underlying RatSLAM, [33] tests the
hypothesis that place cells with sensor fields in any sufficiently
dense configuration should make it possible [for a rat] not only
to localize well, but also to accurately represent the topology
of the environment by estimating the nerve of the system
of place fields from observations of near-synchrony in place-cell
firing. It is shown through simulations that recovering
the topological invariants of a connected planar arena, as
well as some approximation of its geometry, is possible with
a sufficiently dense network of convex place fields. Further
evidence in support of this idea is the recently introduced [40]
method for localization in an urban canyon, as well as the
ELM architecture E3, using nerve-like information (nesting 
among convex polygons in the plane) as a means of encoding 
spatio-temporal context in an ‘episodic memory’ for a planar 
agent. 

For the nerve of a covering, see [33], section 3.3.

A significant drawback of nerve estimation is that, by
definition, computing the nerve of a covering requires exponential
space in the cardinality of the covering. At the
same time, restricting attention to pairwise intersections only, 
and focusing on those of them that are empty turns out to 
guarantee a universal model space for each combinatorial 
type of this ’’reduced nerve”, by Sageev-Roller duality ll5^ . 
EB. By construction, snapshots are estimators of this reduced 
nerve, turned into a computational tool for navigating the 
corresponding model space after converting all relations of 
the form ”A n B negligible/unimportant” into relations of the 
form ”A implies B”. 

The idea of leveraging nesting relations among geometric 
descriptors of events for the purpose of properly representing 
context is a well-recognized and widely used tool in the 
literature, for example: nesting of planar domains is used in 
E3, as well as the more recent EH, and a notion of nesting
for actions is used in a. What is new about the snapshot 
architecture is its application of this principle to the entire 
sensorium, including the set of available actions. 

Our numerical experiments with closed loop control suggest 
that a snapshot-driven agent with sufficient actuation and 
sufficiently rich sensorium is capable of learning a good 
approximation of the gradient field of a (discrete) Morse 
function despite being given no prior semantic information and
starting out with random ‘motor babble’ for its navigation 
strategy. 

In fact, the snapshot architecture is flexible enough to 
trivialize the task of introducing discrete variants of complex 
motivational systems [59] based on introspective sensing of
signals quantifying (a) internally available resources (e.g. 
battery charge), (b) repulsion or fear of punishment, (c) 
attraction or anticipation of a reward signal (e.g., in the 
sense of navigation functions US or in the broad sense of 
Reinforcement Learning ifSTl l. and even (d) frustration over 
the failure of a plan i69\ and (e) measures of innovation 153 . 

El- 

We expect systems such as (a)-(c) to contribute to an agent's
planning capabilities from the point of view of the variety of
tasks they would enable. Even more significantly, we expect
(d) and (e) to contribute to the agent's ability to improve
the quality of the topological representation encoded in its
snapshot. Namely, (d) could be used to facilitate chunking by
serving as a signal driving the creation of new conditional
sensors detecting inconsistent states of the model space, while
(e) could drive the learning of useful complex actions, as
has already been proposed for many other architectures E5l . 
a, im, Ei, resulting in improving the connectivity of the 
model space. Endowing the snapshot architecture with these 
capabilities is the most immediate goal for follow-up research. 

Computing the nerve of a covering requires keeping track of the
intersections of all possible sub-collections of the covering.

B. Connectionist Architectures

From the earliest days of the field, even extremely simple
neural networks with a very small number of neurons have
demonstrated the ability to perform complex learning and
control tasks GqI, GD, including complex symbolic structures
such as context-sensitive grammars ca, ES, d, and
complex hierarchical schemata whose structural features vastly
outnumber the physical components of the actual network [75].
Even simple feed-forward networks have been shown to be 
capable of exercising ‘intelligent’ control through stigmergy 
GD, the depositing of tokens in the environment. Furthermore, 
the merging of reinforcement learning methods with the com¬ 
putational power of ‘deep learning’ networks El, GSl has 
yielded architectures capable of matching and outperforming 
humans in complex tasks such as learning and playing video 
games [79] based only on the raw RGB output provided by
the game console. 

At the same time, the power of connectionist architectures 
comes with very limited formal understanding of how the 
internal representations they maintain encode symbolic rep¬ 
resentations in terms of problem spaces. 

Though the connection demonstrated in this work falls short
of solving this problem (due to the over-simplification of the
connectionist structure and due to realization of plasticity in
the network by a non-neural controller), direct analogies with
current studies of plasticity and neural coding give hope that 
a suitable generalization of the snapshot architecture could 
yield both strength of performance and provable guarantees 
of symbolic reasoning processes. 

Of the classes of neural networks that are well understood, 
most relevant for our purposes is that of competitive attractor 
networks, whose strong stability properties lISOll were applied 
to modeling the navigation mechanisms in rats, also inspiring 
the RatSLAM mapper lfT3ll . 

Expanding on these results, ED, i9) sparked the discussion 
of the structure of the set of binary codewords corresponding 
to stable activity patterns of threshold-linear neural networks. 
This line of inquiry was picked up in [82], producing a
combinatorial characterization of the possible codes in terms
of the network's organization; and in [50], initiating a rigorous
study of the way codes vary as the synaptic weights are 
perturbed while subject to structural constraints, exposing 
interesting connections with topological invariants associated 
with these constraints. 

The analogy with our work is straightforward. The process 
of obtaining the coherent projection of a binary observation 
by propagating it through the derived graph of a snapshot is 
completely analogous to the process of propagating a signal 
through a threshold-linear neural network and waiting for the 
network state to stabilize at a code word — especially when 
taking into account the excitatory nature of relations of the 
form a <b and inhibitory role of relations of the form a < b* 
under propagation. 

Chasing this analogy, it could be worthwhile investigating 
the degree to which the collection of codewords of a threshold- 
linear network conforms to the strict demands of median ge¬ 
ometry (represented by coherent snapshot states), to establish 
a rigorous formal connection (if it exists) between the two 
architectures. A more general study of which neural learning 
methods [83] could be transferred into a snapshot architecture
may, on one hand, expand the range of applications for 
snapshots, while, on the other hand, provide some existing 
architectures with a rigorous symbolic interpretation. 


V. Conclusion 

We introduce a new computationally efficient architecture
intended to endow a generic discrete binary agent with the
capacity to build over time an actionable model of its operations
within a completely unknown dynamic environment, E.
The proposed architecture has a dual nature. On one hand, the
agent maintains an evolving data structure, the snapshot
$\mathcal{S}|_t$, of size quadratic in the number of sensors, encoding a
planning mechanism based on propagation of excitation and
inhibition signals through the highly plastic directed network
$\mathrm{Dir}(\mathcal{S}|_t)$, and is, thus, in a very crude sense, a connectionist
learning and control architecture. On the other hand, the rather
specific ordering properties of networks arising in this way
(the derived 'weak poc set structure' $P|_t$) also characterize
any such network as encoding a system of 'half-spaces' in a
geometric internal representation $M|_t$ that is just rich enough
to account for all sensory equivalence classes provided to
the agent by its sensorium $\Sigma$. This duality affords the
reinterpretation of snapshots as encoding a high-level symbolic
representation of the problem space (i.e., the state space X
and the transition system induced on it by the agent's interaction
with its dynamic environment), through a mathematical
formalism that rigorously supports symbolic planning with the
efficiency and economy of a connectionist architecture.

Clearly, our current snapshot architectures (section II) lack
certain key features found in existing AGI architectures. First
and foremost among these is a mechanism for enriching the
agent's sensorium with sensors representing general Boolean
predicates (or, even better, some limited LTL predicates), composed
of the original atomic sensations. Of course, the problem
lies not in proposing intuitively attractive approaches (there are
many) but rather in doing so in a principled, economical way that
maintains the present combination of analytical and computational
tractability. These 'compound' sensors are required
for facilitating chunking and the learning of useful motor
primitives. Another required feature is a capacity for symbolic
abstraction (relating problem spaces via substitutions). While
the duality theory of weak poc sets and their model spaces
(appendix A-E) affords a rigorous formulation of enriched
predicates and consequent symbolic abstraction, it is not yet
clear how to engineer an enlarged snapshot-like architecture
realizing such meta-extensions.
Appendix A

A Primer on Sageev-Roller Duality 

The duality between poc sets and median algebras, going 
back to ||8^ . was thoroughly studied by Martin Roller in EH, 
in a very successful attempt of pushing the envelope on a 
theory of actions of discrete groups on simply connected non-positively
curved cubical complexes, henceforth referred to as
cubings - pioneered by Michah Sageev in ll5^ and by Victor 
Chepoi [43], who characterized such complexes in terms of
the convexity theory of their 1-dimensional skeleta. 

This appendix provides a detailed review of the elements 
of this theory supporting the memory architecture proposed in 
this paper. This overview of the preliminary material is meant
to extend the initial discussion provided in [30] as well as to
illustrate it with examples, intended as bridges to our current 
application. In the end, the duality theory of poc sets will be 
called upon to provide the necessary formal guarantees that the 
proposed memory and control architectures actually do their 
job. We will mainly rely on ED as a source of theoretical 
results, though sometimes it will be easier to use results from
the elegant exposition in [85].
A. Basic Notions

We introduced the extension of the duality theory of poc
sets to so-called weak poc sets in [30] out of the necessity to
maintain poc sets as dynamical data structures.

Definition A.1 (Weak Poc Set). A partially-ordered set $(P, \le)$
endowed with an order-reversing fixpoint-free involution $a \mapsto a^*$
and having a minimum element $0 \in P$ is called a weak poc
set. Note that $0^*$ is a maximum for $P$. Thus, for all $a, b \in P$
one has:

• $0 \le a = a^{**}$ and $a^* \ne a$;

• $a \le b \Rightarrow b^* \le a^*$.

An element $a \in P$ is said to be negligible if $a \le a^*$, and
ubiquitous if $a^*$ is negligible. A poc set is a weak poc set
in which 0 is the only negligible element. An element that is
neither negligible nor ubiquitous is said to be proper. □

Weak poc sets form a category: 

Definition A.2 (Poc Morphism). A function $f : P \to Q$
between two weak poc sets is a poc morphism if $f$ is
order-preserving, $*$-equivariant and $f(0) = 0$. The set of all poc
morphisms as above will be denoted $\mathrm{Hom}(P, Q)$. □

1) The Minimal Poc Set: The set {0,1} with the relations 
0 < 1 and 1 = 0* is a poc set, and it is denoted by 2. Clearly, 
there is only one poc morphism of 2 into any weak poc set 
P, but then there may be many poc morphisms of a weak poc 
set P onto 2. 

2) Generators and Relations: A weak poc set $P = \langle S \mid R \rangle$
may be specified using a set $S$ of generators and a set of
relations $R$ of the form $a < b$ or $a^* < b$ or $a < b^*$ for
$a, b \in S$.

Formally, $P$ is constructed as follows. Assume that the symbol
0 is not contained in $S$. First, set $S^\pm := (\{0\} \cup S) \times \{+, -\}$
and define $(s, +)^* = (s, -)$ and $(s, -)^* = (s, +)$. Thus, $S^\pm$
has a fixpoint-free involution $*$ defined on it. For simplicity,
for each $s \in \{0\} \cup S$ we identify $(s, +)$ with $s$.

The relation set $R$ is required to be a subset of $S^\pm \times S^\pm$.
We then define an extension $\overline{R}$ of $R$ to be the intersection
of all relations $W \subseteq S^\pm \times S^\pm$ that contain $R$, are reflexive
and transitive and, in addition, satisfy the following:

• $(0, a) \in W$ holds for all $a \in S^\pm$;

• For all $a, b \in S^\pm$, if $(a, b) \in W$ then $(b^*, a^*) \in W$.

We set $P$ to be the quotient of $S^\pm$ by the equivalence

$$x \sim y \iff (x, y) \in \overline{R} \wedge (y, x) \in \overline{R}$$

with the induced partial ordering

$$[x] \le [y] \iff (x, y) \in \overline{R}.$$

For example, the notation

$$\langle a, b, c \mid a < c,\ b < c \rangle$$

stands for the poc set

$$P = \{0, 0^*, a, b, c, a^*, b^*, c^*\}$$

having the order relations

$$0 < a < c < 0^*, \qquad 0 < b < c < 0^*,$$
$$0 < c^* < a^* < 0^*, \qquad 0 < c^* < b^* < 0^*,$$

as well as the ones derived from these by transitivity. Thus,
generators and relations provide a compact way of representing
a (weak) poc set explicitly.

(Footnote: one may also use weak inequalities ($\le$) to specify relations in $R$.)
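
The closure construction described above can be carried out mechanically. The following is a minimal Python sketch, not taken from the original text: the helper names `star` and `present` and the string encoding of elements (`"a"`, `"a*"`, `"0"`, `"0*"`) are assumptions made for illustration. It computes the smallest reflexive, transitive relation containing the given relations, with 0 below everything and closed under the rule $(a, b) \mapsto (b^*, a^*)$. It returns the resulting preorder on the formal symbols and does not perform the final quotient by mutual comparability.

```python
def star(x):
    """Formal complement: a <-> a*, 0 <-> 0*."""
    return x[:-1] if x.endswith("*") else x + "*"

def present(generators, relations):
    """Preorder generated by `relations`, given as pairs (x, y) meaning x <= y."""
    elems = ["0", "0*"] + [e for g in generators for e in (g, g + "*")]
    le = {(x, x) for x in elems} | {("0", x) for x in elems} | set(relations)
    while True:
        new = {(star(y), star(x)) for x, y in le}                # contrapositives
        new |= {(x, w) for x, y in le for z, w in le if y == z}  # transitivity
        if new <= le:
            return elems, le
        le |= new

# The example <a, b, c | a < c, b < c> from the text.
elems, le = present(["a", "b", "c"], [("a", "c"), ("b", "c")])

# Relations between distinct elements, leaving out those involving 0 or 0*.
proper = sorted((x, y) for x, y in le
                if x != y and x not in ("0", "0*") and y not in ("0", "0*"))
print(proper)
```

Under these assumptions the only relations among the six non-trivial symbols are $a \le c$, $b \le c$ and their contrapositives $c^* \le a^*$, $c^* \le b^*$, in agreement with the chains listed above.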

As another example, consider the poc sets

$$P = \langle a, b \mid a < b \rangle, \qquad Q = \langle a, b \mid a^* < b \rangle.$$

The partial assignment $f : P \to Q$ satisfying

$$f(a) = a^*, \qquad f(b) = b$$

has one and only one extension to a poc morphism of $P$ into $Q$.

3) σ-Algebras as poc sets: Let $\mathfrak{B}$ be a σ-algebra on a non-empty
(possibly infinite) set $X$. Then $(\mathfrak{B}, \subseteq, A \mapsto X \setminus A)$
is a poc set. In particular, the power set of $X$, denoted $2^X$,
obtains the structure of a poc set in this way. It is standard
to identify $2^X$ with the space of functions $f : X \to 2$:
any such $f$ will be identified with the subset $f^{-1}(1) \in 2^X$.

Note that the intersection and symmetric difference operators
translate under this identification into pointwise multiplication
and addition modulo 2, respectively. Recalling our notation $(\cdot : \cdot)$
for the evaluation of functions, it will be convenient to extend
it as follows:

$$(f : x) = f(x), \qquad (f^* : x) = 1 + (f : x) \pmod 2 \qquad (57)$$

The poc set structure on $2^X$ may then be written in functional
form via

$$f \le g \text{ in } 2^X \iff \forall x \in X:\ (f : x) \le (g : x) \text{ in } 2 \iff fg = f, \qquad (58)$$

where $f, g \in 2^X$ are arbitrary elements.

Definition A.3 (realization). Let $P$ be a weak poc set and
let $X$ be a non-empty set. A realization of $P$ in $X$ is a poc
morphism $f : P \to 2^X$. □

A realization $f : P \to 2^X$ provides a consistent way of
regarding each $a \in P$ as a binary question over $X$, so that the
set of all $x \in X$ with $(f(a) : x) = 1$ is the set of all points
where the question is answered affirmatively.

4) Canonical Quotient: Every weak poc set $P$ has a canonical
poc set quotient $\hat{P}$ obtained as the quotient of $P$ by the
equivalence relation

$$a \sim b \iff \begin{cases} a = b, \text{ or} \\ a, b \text{ are both negligible, or} \\ a, b \text{ are both ubiquitous} \end{cases} \qquad (59)$$

and endowed with the obvious ordering and involution.

Definition A.4. Let $P$ be a weak poc set and let $\hat{P}$ denote
its canonical poc quotient. For every $a \in P$, we denote the
equivalence class of $a$ in $\hat{P}$ with $\hat{a}$. The poc morphism $a \mapsto \hat{a}$
will be denoted by $\pi_P$. □

The main characteristic of $\hat{P}$ is the following elementary
lemma:

Lemma A.5. Let $P$ be a weak poc set. Then any poc morphism
$f : P \to Q$ of $P$ into a poc set $Q$ factors through $\pi_P$, that
is: there exists one and only one poc morphism $\hat{f} : \hat{P} \to Q$
satisfying $f = \hat{f} \circ \pi_P$.

For example, seeking a realization of a weak poc set
structure in any space $X$ is only possible after identifying all
negligible elements with 0, because the only subset $F$ of $X$
satisfying $F \subseteq X \setminus F$ is $F = \emptyset$.





B. The Dual Graph of a Poc Set 

The duality theory for poc sets is an extension of Stone
Duality [86]. At the base of the construction are binary
selections:

Definition A.6 ($*$-selections). Let $P$ be a weak poc set. A
subset $A \subseteq P$ is a $*$-selection on $P$ if no $a \in A$ has $a^* \in A$.
A $*$-selection $A$ on $P$ is complete if for any $a \in P$ either
$a \in A$ or $a^* \in A$. The set of all complete $*$-selections on $P$
is denoted by $S(P)^0$. □

The following is a metric on $S(P)^0$:

$$\Delta(A, B) = |A \setminus B| = |B \setminus A| = \tfrac{1}{2} |A \,\triangle\, B| \qquad (60)$$

Indeed, fixing $A_0 \in S(P)^0$, an explicit isometry of
$(S(P)^0, \Delta)$ onto $2^{A_0}$ endowed with the Hamming distance is
constructed by sending $A \in S(P)^0$ to the [indicator function
of the] set $A_0 \setminus A$. Thus, $S(P)^0$ may be thought of as simply
being the vertex set, or 0-skeleton, of the $\tfrac{1}{2}|P|$-dimensional
standard unit cube, viewed as a combinatorial cubical complex
- we denote this complex by $S(P)$ - where a cubical face $Q$
of $S(P)$ of dimension $d$ corresponds to a maximal subset of
$S(P)^0$ with diameter $d$ as calculated in the metric $\Delta(\cdot, \cdot)$.
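
The identities in (60) and the Hamming isometry can be checked by brute force on a small example. The sketch below is illustrative only: the function names and the string encoding of elements are assumptions, and the two generators `a`, `b` are arbitrary. It enumerates all complete $*$-selections, verifies the three expressions for $\Delta$, and confirms that sending $A$ to the indicator vector of $A_0 \setminus A$ preserves distances.

```python
from itertools import product

def complete_selections(generators):
    """All complete *-selections: pick exactly one of g, g* for each g in {0} + generators."""
    base = ["0"] + list(generators)
    return [frozenset(g + "*" if flip else g for g, flip in zip(base, flips))
            for flips in product([False, True], repeat=len(base))]

def delta(A, B):
    """The metric of (60), computed as |A \\ B|."""
    return len(A - B)

def code(A, A0, order):
    """Indicator vector of A0 \\ A, with coordinates listed in `order`."""
    return tuple(int(x in A0 and x not in A) for x in order)

def hamming(p, q):
    return sum(s != t for s, t in zip(p, q))

sels = complete_selections(["a", "b"])
A0 = sels[0]
order = sorted(A0)

identities_hold = all(
    delta(A, B) == delta(B, A) == len(A ^ B) // 2
    and hamming(code(A, A0, order), code(B, A0, order)) == delta(A, B)
    for A in sels for B in sels
)
print(len(sels), identities_hold)
```

With two generators the underlying set has six elements, so the selections form the vertex set of a 3-dimensional cube.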

1) Construction of the Dual Graph: Some vertices of $S(P)$
cannot be witnessed in any realization of $P$:

Definition A.7 (Coherence). A pair of elements $a, b \in P$ is
said to be incoherent if $a \le b^*$. A subset $A$ of a poc set $P$ is
said to be coherent if it contains no incoherent pair. □

Definition A .8 (Dual graph, dual Cubing). Given a (finite) 
poc set P, one defines: 

(a) The dual cubing of P, denoted Cube(P), is the sub¬ 
complex of S{P) induced by the set of coherent vertices; 

(b) The dual median algebra of P, denoted P°, is the 0- 
skeleton of Cube(P); 

(c) The dual graph of P, denoted Dual(P), is the 1-skeleton 

of Cube(P). □ 

2) A Simple Example: To illustrate the definition, consider
the poc set $P$ whose Hasse diagram is given in figure 9(a).
Given by generators and relations, $P$ takes the form:

$$P = \langle a, b, c \mid a < c^*,\ b < c^* \rangle \qquad (61)$$

A good way of thinking about a poc set is to pretend that 
it is realized in a space X, so that our P is a collection of 
three binary questions (a, b and c) about X, augmented with 
the complementary questions (a*,b* and c*), together with 
an additional record of known implication relations between 


Fig. 9. A simple poc set $P$ on three generators (a) and the resulting cube
complex (b), obtained by deleting all incoherent vertices from the cube $S(P)$
(c) - see example A-B2.


them ($a < c^*$ and $b < c^*$). In the absence of any implications
on record, an observer endowed with $P$ will model the space
$X$ as the full 3-cube $S(P)$, where the proper elements of $P$
correspond to the co-dimension one faces of the cube - fig.
9(c). On the other hand, knowledge of the above relations
renders some of the vertices of $S(P)$ redundant, resulting
in a reduction in the number of binary states necessary for
modeling $X$ using the same three questions - fig. 9(b).
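
The culling of incoherent vertices in this example can be reproduced by direct enumeration. The following Python sketch is a hedged illustration rather than the authors' procedure: the helper names and the string encoding are assumptions, and the trivial coordinate is fixed to $0^*$ because 0 is negligible. It builds the order generated by $a < c^*$ and $b < c^*$, lists the coherent complete $*$-selections, and counts the edges joining selections that differ in a single answer.

```python
from itertools import product

def star(x):
    """Formal complement: a <-> a*, 0 <-> 0*."""
    return x[:-1] if x.endswith("*") else x + "*"

def present(generators, relations):
    """Preorder generated by `relations`, given as pairs (x, y) meaning x <= y."""
    elems = ["0", "0*"] + [e for g in generators for e in (g, g + "*")]
    le = {(x, x) for x in elems} | {("0", x) for x in elems} | set(relations)
    while True:
        new = {(star(y), star(x)) for x, y in le}
        new |= {(x, w) for x, y in le for z, w in le if y == z}
        if new <= le:
            return elems, le
        le |= new

def coherent_vertices(generators, le):
    """Complete *-selections containing no pair a, b with a <= b*."""
    verts = []
    for flips in product([False, True], repeat=len(generators)):
        v = frozenset({"0*"} | {g + "*" if f else g for g, f in zip(generators, flips)})
        if not any((a, star(b)) in le for a in v for b in v):
            verts.append(v)
    return verts

def edges(verts):
    """Pairs of vertices differing in exactly one answer."""
    return [(u, v) for i, u in enumerate(verts) for v in verts[i + 1:]
            if len(u - v) == 1]

gens = ["a", "b", "c"]
_, le = present(gens, [("a", "c*"), ("b", "c*")])
V = coherent_vertices(gens, le)
E = edges(V)
print(len(V), len(E))  # 5 vertices and 5 edges
```

Three of the eight cube vertices (those containing $c$ together with $a$ or $b$) are discarded, leaving a square on the $c^*$ side with a single edge attached at $\{a^*, b^*, c^*\}$.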

Remark A.9. We have chosen the term coherent subset
over Roller’s filter-base to better fit the context of our
planning/sensing problem.

Remark A.10. The standard identification of $2^P$ with the
space of $\{0, 1\}$-valued functions on $P$ also puts the set $P^\circ$ of
coherent vertices of $S(P)$ in one-to-one correspondence with
the set $\mathrm{Hom}(P, 2)$ of poc morphisms of $P$ onto the trivial
poc set 2.


C. Poc Set Duals are Median Graphs 

Graphs of the form Dual(P) (for a weak poc set P) are 
completely characterized. We recall: 

Definition A.11 (hop-distance, intervals). Let $G = (V, E)$ be
a connected simple graph and let $u, v \in V$. The hop distance
$d_G(u, v)$ is defined to be the minimum length of an edge-path
in $G$ joining $u$ with $v$. The interval $I(u, v)$ is defined
to be the set of all vertices $w \in V$ satisfying the equality
$d_G(u, v) = d_G(u, w) + d_G(w, v)$. □

A fundamental fact about the dual $\mathrm{Dual}(P)$ of a poc set $P$
is a quick corollary of the results in [85], section 4:

Lemma A.12. Let $G = \mathrm{Dual}(P)$ for a finite poc set $P$. Then
the metric $\Delta$ coincides with the hop metric on $G$.


An important and well-studied class of graphs is: 


Definition A.13 (median graphs [43], [87]). A connected
simple graph $G = (V, E)$ is said to be a median graph if
the set $I(u, v) \cap I(v, w) \cap I(u, w)$ contains exactly one vertex
for each $u, v, w \in V$. This vertex is the median of the triple
$(u, v, w)$ and denoted by $\mathrm{med}(u, v, w)$ - see figure 10. □
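
The definition of intervals and medians translates directly into a breadth-first-search computation. The sketch below is an illustration under stated assumptions: the grid size, the three sample vertices and all function names are chosen here and do not come from the text. It computes the three intervals in a rectangular integer grid and intersects them, and then repeats the computation on a triangle to show a graph that is not median.

```python
from collections import deque
from itertools import product

def grid_graph(width, height):
    """Adjacency lists of the width x height integer grid."""
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return {(x, y): [(x + dx, y + dy) for dx, dy in steps
                     if 0 <= x + dx < width and 0 <= y + dy < height]
            for x, y in product(range(width), range(height))}

def hop_distances(adj, source):
    """Hop distance from `source` to every vertex, by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def interval(adj, u, v):
    """All w with d(u, v) = d(u, w) + d(w, v)."""
    du, dv = hop_distances(adj, u), hop_distances(adj, v)
    return {w for w in adj if du[w] + dv[w] == du[v]}

def medians(adj, u, v, w):
    """Intersection of the three intervals; a single vertex in a median graph."""
    return interval(adj, u, v) & interval(adj, v, w) & interval(adj, u, w)

grid = grid_graph(5, 4)
grid_median = medians(grid, (0, 0), (4, 1), (2, 3))

triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
triangle_median = medians(triangle, 0, 1, 2)
print(grid_median, triangle_median)
```

In the grid the median is the coordinate-wise median of the three points, here $(2, 1)$. In the triangle the three intervals are the three edges, whose common intersection is empty, so the triangle is not a median graph.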


Median graphs are a special subfamily of a family of ternary
algebras, called median algebras [88], [89], [90], [91]. Some
modern generalizations and applications may be found in [92].

Fig. 10. Computing a median in the integer grid (see example A-D2 for a
poc-set presentation).

Another way of stating the preceding lemma (again, see
[85], section 4, where these results are derived in a much
more general setting):

Theorem A.14. The dual $G = \mathrm{Dual}(P)$ of a finite poc set $P$
is a finite median graph, with the median calculated according
to the formula

$$\mathrm{med}(u, v, w) = (u \cap v) \cup (u \cap w) \cup (v \cap w) \qquad (62)$$

for all $u, v, w \in P^\circ$.

In other words, the median of three coherent $*$-selections
is determined by a majority vote on the values of their
observations.


D. Convexity

Median graphs have a very strong convexity theory. We
recall:

Definition A.15 (Convexity). Let $G = (V, E)$ be a graph. A
subset $K \subseteq V$ is said to be convex if $I(u, v) \subseteq K$ holds
whenever $u, v \in K$. □

Definition A.16 (Half-Spaces). Let $G = (V, E)$ be a graph.
A subset $H \subseteq V$ is said to be a half-space of $G$ if both $H$
and $V \setminus H$ are convex subsets of $G$. □

For example, lemma A.12 can be used to quickly deduce

Lemma A.17. Let $P$ be a poc set. Then the half-spaces of
$\mathrm{Dual}(P)$ are precisely the subsets of $P^\circ$ of the form

$$\mathfrak{h}(a) := \{u \in P^\circ \mid a \in u\} \qquad (63)$$

where $a$ ranges over $P$. In particular, subsets of $P^\circ$ of the
form

$$\mathfrak{h}(K) := \{u \in P^\circ \mid K \subseteq u\} = \bigcap_{a \in K} \mathfrak{h}(a) \qquad (64)$$

are convex in $\mathrm{Dual}(P)$.

Remark A.18. Note also that $\mathfrak{h}(a^*) = P^\circ \setminus \mathfrak{h}(a)$ for all
$a \in P$.

Much more can be said in general:

Theorem A.19 (Properties of Median Graphs, ED, section
2). Let $G = (V, E)$ be a finite median graph. Then:

1) Every convex set is an intersection of halfspaces;

2) Any family of pairwise-intersecting convex sets has a
common vertex (1-dimensional Helly property);

3) For any convex subset $K \subseteq V$, the subgraph of $G$
induced by $K$ is a median graph;

4) For any convex subset $K \subseteq V$ and any vertex $v \in V$
there exists a unique vertex $\mathrm{proj}_K v \in K$ at minimum
hop distance from $v$;

5) For any convex subset $K \subseteq V$, the closest-point
projection $\mathrm{proj}_K$ is a median-preserving, distance
non-increasing map of $G$ onto the subgraph of $G$ induced
by $K$.

Any graph $G = (V, E)$ generates a poc set $\mathrm{Poc}(G)$: we let
the underlying set of $\mathrm{Poc}(G)$ be the set of all half-spaces of
$G$, then we order it by inclusion and set $H^* = V \setminus H$ for the
complementation operator. Of course, some graphs (e.g. any
odd cycle) will generate the trivial poc set in this way, but not
so for median graphs:

Theorem A.20 ([57], proposition 5.9). Let $G$ be a finite
median graph. Then $G$ is canonically isomorphic to
$\mathrm{Dual}(\mathrm{Poc}(G))$ via the median-preserving map which sends
each vertex $v$ to the collection of halfspaces of $G$ which
contain $v$.

An important conclusion (special case of proposition 6.11
in llll) is:

Corollary A.21. For any finite poc set $P$, the map $a \mapsto \mathfrak{h}(a)$ is
a poc-isomorphism of $P$ onto $\mathrm{Poc}(\mathrm{Dual}(P))$. In other words,
the poc-set $P$ may be reconstructed from its dual.

From a practical standpoint, these two results offer an
approach to understanding the geometry of $\mathrm{Cube}(P)$ in terms
of the order structure of $P$, which is the purpose of this and the
following sections. As an aside, let us mention also that these
results are best viewed together in categorical terms, as part of
a restatement of the duality between the category of finite poc
sets (with poc morphisms) and the category of finite median
algebras (with median-preserving maps) - see appendix A-E
below.

We must consider the possible relations (if any) among
elements $a, b \in P$. Those are:

$$a \le b, \qquad a^* \le b, \qquad a^* \le b^*, \qquad a \le b^* \qquad (65)$$

It is easy to see that a pair of distinct proper elements will
never satisfy two of the above conditions at the same time,
as $\mathrm{Cube}(P)$ provides us with a realization of $P$ inside $2^{P^\circ}$ -
after all, the last theorem tells us that:

$$a \le b \iff \mathfrak{h}(a) \subseteq \mathfrak{h}(b) \qquad (66)$$

Definition A.22 (Nesting and Transversality). Suppose $a, b$
are proper elements of a weak poc set $P$. We say that they
cross ($a \pitchfork b$), if none of (65) hold. Otherwise, we say they
are nested ($a \parallel b$). A subset $A$ of $P$ is said to be nested if all
its elements are pairwise nested, and transverse if its elements
cross pairwise. □

Thus, the half-spaces of $\mathrm{Dual}(P)$ are nothing more than the
restriction to $\mathrm{Dual}(P)$ of the half-spaces of the cube $S(P)^0$,
with two of them nesting if and only if the corresponding
elements of $P$ are nested, that is, if and only if exactly one of
the following holds:

$$\mathfrak{h}(a) \cap \mathfrak{h}(b) = \emptyset, \qquad \mathfrak{h}(a^*) \cap \mathfrak{h}(b) = \emptyset,$$
$$\mathfrak{h}(a^*) \cap \mathfrak{h}(b^*) = \emptyset, \qquad \mathfrak{h}(a) \cap \mathfrak{h}(b^*) = \emptyset \qquad (67)$$

Fig. 11. Dual graphs for two arrangements of sensors along the real line (see
A-D1): threshold sensors encoding a path (a), and beacon sensors encoding
a ‘starfish’ (b).
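
Both the majority-vote description of the median and the half-space criterion for nesting can be tested on the three-generator poc set with relations $a < c^*$ and $b < c^*$. The Python sketch below is illustrative only: all helper names and the string encoding are assumptions. It checks that the majority vote of any three coherent vertices is again a coherent vertex, and lists, for a pair of elements, which of the four half-space intersections are empty.

```python
from itertools import product

def star(x):
    """Formal complement: a <-> a*, 0 <-> 0*."""
    return x[:-1] if x.endswith("*") else x + "*"

def present(generators, relations):
    """Preorder generated by `relations`, given as pairs (x, y) meaning x <= y."""
    elems = ["0", "0*"] + [e for g in generators for e in (g, g + "*")]
    le = {(x, x) for x in elems} | {("0", x) for x in elems} | set(relations)
    while True:
        new = {(star(y), star(x)) for x, y in le}
        new |= {(x, w) for x, y in le for z, w in le if y == z}
        if new <= le:
            return elems, le
        le |= new

def coherent_vertices(generators, le):
    """Complete *-selections containing no pair a, b with a <= b*."""
    verts = []
    for flips in product([False, True], repeat=len(generators)):
        v = frozenset({"0*"} | {g + "*" if f else g for g, f in zip(generators, flips)})
        if not any((a, star(b)) in le for a in v for b in v):
            verts.append(v)
    return verts

def median(u, v, w):
    """Majority vote: keep every answer given by at least two of u, v, w."""
    return (u & v) | (u & w) | (v & w)

gens = ["a", "b", "c"]
_, le = present(gens, [("a", "c*"), ("b", "c*")])
V = set(coherent_vertices(gens, le))

def halfspace(a):
    return {u for u in V if a in u}

def empty_corners(a, b):
    """Which of the four intersections of half-spaces of a, a*, b, b* are empty."""
    return [(p, q) for p in (a, star(a)) for q in (b, star(b))
            if not (halfspace(p) & halfspace(q))]

median_closed = all(median(u, v, w) in V for u in V for v in V for w in V)
print(median_closed, empty_corners("a", "b"), empty_corners("a", "c"))
```

Here $a$ and $b$ cross, so none of the four intersections is empty, while $a$ and $c$ are nested through $a \le c^*$, which shows up as the single empty intersection of the half-spaces of $a$ and $c$.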

We conclude that the more relations are on record in the order 
structure of P the fewer transverse sets there are to be found 
there. In other words, nesting relations are an obstruction to 
high-dimensional cubes in Cube(P); each additional relation in
P implies fewer faces of the original cube S{P) survive the 
culling of incoherent vertices used for obtaining Cube(P). At 
one extreme one finds Cube(P) = S{P) when P itself (up 
to removing improper elements) is transverse. At the other 
extreme (exercise for the reader) Cube(P) forms a tree if and 
only if P is nested. 

1) Example: the path of length N: Consider an idealized
point-robot situated on the interval $E = [0, 1]$ and capable
of moving about in this environment. Suppose the robot is
endowed with binary sensors $a_1, \ldots, a_N$, each responding to
the robot’s position - denoted for now by $x$ - according to
the rule, say, that $a_k$ turns on whenever $x < x_k := k/N$. It
would be reasonable for us to wish for the robot to eventually
be able to realize that $a_k$ turning on implies $a_{k+1}$ turning on,
for all $k < N$. Forming the poc set

$$P = \langle a_1, \ldots, a_N \mid a_k < a_{k+1},\ k = 1, \ldots, N-1 \rangle \qquad (68)$$

it is easy to verify that $\mathrm{Dual}(P)$ is the $N$-path - the path with
$N + 1$ vertices and $N$ edges - whose vertices are all of the
form

$$v_k = \{0^*\} \cup \{a_j^*\}_{j \le k} \cup \{a_j\}_{j > k}, \qquad 0 \le k \le N \qquad (69)$$

Please note that the choice of the points $x_k \in E$ is immaterial
- only their ordering should matter for the correctness of
$\mathrm{Cube}(P)$ as a discretized model of the ‘environment’ $E$ of
our robot.

At the same time, imagine that the sensors $a_k$ corresponded
to ‘beacons’, with $a_k$ turning on if and only if $|x - x_k|$ falls
below a small threshold.
Then a poc set description of the form

$$P = \langle a_1, \ldots, a_N \mid a_k < a_j^*,\ 1 \le k < j \le N \rangle \qquad (70)$$

would be more appropriate, indicating that the sensations $a_k$
are mutually exclusive. The resulting dual would still have
$N + 1$ vertices and $N$ edges, the vertices being:

$$v'_k = \{0^*, a_k\} \cup \{a_j^*\}_{j \ne k} \text{ for } 1 \le k \le N, \qquad v'_0 = \{0^*\} \cup \{a_1^*, \ldots, a_N^*\} \qquad (71)$$

In both cases the dual graph is a tree (a path and a starfish),
and it is hard to ignore the difference in the quality of its
representation of the underlying space - see figure 11.

Fig. 12. Cubical models for example A-D3 with relations (a) $a_i < a_{i+x}^*$
where $x \in \{2, 3, 4\}$ and addition is modulo 6, and (b) only the
relations $a_i < a_{i+3}^*$ are present. Black vertices are those coherent
for both poc set structures. Vertices painted white are coherent
vertices for agent #2 that are incoherent for agent #1. The vertex
$v$ corresponds to the shared coherent $*$-selection $\{a_0^*, \ldots, a_5^*\}$.
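
The contrast between the two sensor arrangements can be seen by computing both duals for a small $N$. The following Python sketch is an illustration under assumptions: $N = 4$ is an arbitrary choice, and the helper names and string encoding (`"a1"`, `"a1*"`, ...) are not from the text. It builds the threshold and beacon poc sets from their relations and compares the vertex counts and degree sequences of the resulting dual graphs.

```python
from itertools import product

def star(x):
    """Formal complement: a <-> a*, 0 <-> 0*."""
    return x[:-1] if x.endswith("*") else x + "*"

def present(generators, relations):
    """Preorder generated by `relations`, given as pairs (x, y) meaning x <= y."""
    elems = ["0", "0*"] + [e for g in generators for e in (g, g + "*")]
    le = {(x, x) for x in elems} | {("0", x) for x in elems} | set(relations)
    while True:
        new = {(star(y), star(x)) for x, y in le}
        new |= {(x, w) for x, y in le for z, w in le if y == z}
        if new <= le:
            return elems, le
        le |= new

def coherent_vertices(generators, le):
    """Complete *-selections containing no pair a, b with a <= b*."""
    verts = []
    for flips in product([False, True], repeat=len(generators)):
        v = frozenset({"0*"} | {g + "*" if f else g for g, f in zip(generators, flips)})
        if not any((a, star(b)) in le for a in v for b in v):
            verts.append(v)
    return verts

def degree_sequence(verts):
    """Sorted vertex degrees, two vertices being adjacent when they differ in one answer."""
    return sorted(sum(len(u - v) == 1 for v in verts) for u in verts)

N = 4
gens = [f"a{k}" for k in range(1, N + 1)]

# Threshold sensors: a_k < a_{k+1}.
threshold = [(f"a{k}", f"a{k + 1}") for k in range(1, N)]
# Beacon sensors: a_k < a_j* for k < j (mutually exclusive sensations).
beacon = [(f"a{k}", f"a{j}*") for k in range(1, N + 1) for j in range(k + 1, N + 1)]

path = coherent_vertices(gens, present(gens, threshold)[1])
starfish = coherent_vertices(gens, present(gens, beacon)[1])
print(degree_sequence(path), degree_sequence(starfish))
```

Both duals have $N + 1$ vertices and $N$ edges, but the threshold model is a path while the beacon model has a single hub, the all-off vertex, adjacent to every other vertex.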

2 ) Example: direct sums of poc sets: The easiest way to 
join two poc sets together is to form their direct sum: 

Definition A.23. Let $P$ and $Q$ be poc sets. Their direct sum
$P \vee Q$ is defined to be the quotient of their external disjoint
union $P \sqcup Q$ by the identification $0_P = 0_Q$ and $0_P^* = 0_Q^*$,
endowed with the following:

• $a \le b$ in $P \vee Q$ iff $a, b \in P$ and $a \le b$, or $a, b \in Q$ and
$a \le b$;

• $b = a^*$ iff both $a, b \in P$ and $b = a^*$, or $a, b \in Q$ and
$b = a^*$.

(We abuse notation by identifying each element of $P \sqcup Q$ with
the equivalence class in $P \vee Q$ of its natural representative in
$P \sqcup Q$.) □

It is easy to verify, then, that

$$\mathrm{Cube}(P \vee Q) = \mathrm{Cube}(P) \times \mathrm{Cube}(Q) \qquad (72)$$

where the isomorphism is that of cubical complexes. Intuitively,
any proper elements $a \in P$ and $b \in Q$ satisfy $a \pitchfork b$,
resulting in every cube in $\mathrm{Cube}(P)$ and every cube in $\mathrm{Cube}(Q)$
forming a product cube in $\mathrm{Cube}(P \vee Q)$. For example, the grid
in figure 10 may be thought of as the product of an $N$-path
with an $M$-path (for the appropriate values of $M$ and $N$) -
hence the dual of the direct sum of two poc sets of the type
discussed in the preceding example A-D1.
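
The product formula (72) can be checked at the level of vertices and edges for two small path poc sets. The sketch below is illustrative: the generator names `p1, p2` and `q1, q2, q3`, the helper functions and the string encoding are assumptions. The direct sum is modelled by presenting the union of the generators with the union of the relations, which shares the symbols `0` and `0*` between the two summands.

```python
from itertools import product

def star(x):
    """Formal complement: a <-> a*, 0 <-> 0*."""
    return x[:-1] if x.endswith("*") else x + "*"

def present(generators, relations):
    """Preorder generated by `relations`, given as pairs (x, y) meaning x <= y."""
    elems = ["0", "0*"] + [e for g in generators for e in (g, g + "*")]
    le = {(x, x) for x in elems} | {("0", x) for x in elems} | set(relations)
    while True:
        new = {(star(y), star(x)) for x, y in le}
        new |= {(x, w) for x, y in le for z, w in le if y == z}
        if new <= le:
            return elems, le
        le |= new

def coherent_vertices(generators, le):
    """Complete *-selections containing no pair a, b with a <= b*."""
    verts = []
    for flips in product([False, True], repeat=len(generators)):
        v = frozenset({"0*"} | {g + "*" if f else g for g, f in zip(generators, flips)})
        if not any((a, star(b)) in le for a in v for b in v):
            verts.append(v)
    return verts

def edge_count(verts):
    return sum(len(u - v) == 1 for i, u in enumerate(verts) for v in verts[i + 1:])

P_gens, P_rels = ["p1", "p2"], [("p1", "p2")]                      # a 2-path
Q_gens, Q_rels = ["q1", "q2", "q3"], [("q1", "q2"), ("q2", "q3")]  # a 3-path

VP = coherent_vertices(P_gens, present(P_gens, P_rels)[1])
VQ = coherent_vertices(Q_gens, present(Q_gens, Q_rels)[1])
VS = coherent_vertices(P_gens + Q_gens, present(P_gens + Q_gens, P_rels + Q_rels)[1])

# Vertices of the product: unions of one vertex from each factor.
product_vertices = {u | v for u in VP for v in VQ}
print(len(VP), len(VQ), len(VS), edge_count(VS))
```

Under these assumptions the dual of the sum is a 3-by-4 grid: 12 vertices and 17 edges, matching the product of a path on 3 vertices with a path on 4 vertices.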






































3) Example: a cycle of length 6: Imagine an agent - call it
#1 - living on the unit circle $E = S^1$. We mark six vertices,
spread uniformly along the circle, with the digits $\{0, \ldots, 5\}$.
Suppose that agent #1 is capable, for each position it occupies
on $E$, of asking any of the binary questions

• Aj'. Am I positioned at arc length< ^ from position j 
along E? 

Agent #2 asks a slightly different set of questions:

• Bj'. Am I positioned at arc length< ^ from position j 
along E? 

The questions available to either agent have sufficient resolution
to pinpoint the agent’s position wherever it is, but we
claim that the collection $\{A_j\}_{j=0}^{5}$ is, in a sense, more efficient
than $\{B_j\}_{j=0}^{5}$. Let $P = \{0, 0^*\} \cup \{a_i, a_i^*\}_{i=0}^{5}$, where the $a_i$ are
symbols to represent the sensations corresponding to $A_i$ for
agent #1 and to $B_i$ for agent #2. We compare the resulting
embeddings $\rho_i : P \to 2^V$ defined by

$$\rho_1(a_j) = A_j, \qquad \rho_1(a_j^*) = V \setminus A_j,$$
$$\rho_2(a_j) = B_j, \qquad \rho_2(a_j^*) = V \setminus B_j,$$

and with $\rho_i(0) = \emptyset$ and $\rho_i(0^*) = V$, of course. We observe
that both representations of $P$ in $2^V$ form injective poc
morphisms of $P$ into $2^V$ if $P$ is given all relations of the
form $a_i < a_{i+3}^*$ (addition modulo 6), yet only agent #1 can
afford to also add the relations $a_i < a_{i+2}^*$ and $a_i < a_{i+4}^*$ to the
record without losing the property of $\rho_1$ being a poc morphism.
The difference between the duals is significant - see figure 12 -
clearly showing the advantage of the compact and simple
world map that agent #1 could deduce over the cumbersome
monstrosity agent #2 must deal with. Note how the complex
(a) in the figure may be obtained from (b) through deleting
the vertices painted white - those are precisely the vertices
of (b) forming incoherent families for the poc set structure
represented in (a).
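
The sizes of the two cubical models can be compared by enumerating coherent vertices for each set of relations. The Python sketch below is a hedged illustration: the helper names and the encoding `a0, ..., a5` are assumptions, and only the relations stated in the text are used, not the geometric sensors themselves. It counts the coherent vertices for the relations with offsets $\{2, 3, 4\}$ and for the relations with offset 3 only.

```python
from itertools import product

def star(x):
    """Formal complement: a <-> a*, 0 <-> 0*."""
    return x[:-1] if x.endswith("*") else x + "*"

def present(generators, relations):
    """Preorder generated by `relations`, given as pairs (x, y) meaning x <= y."""
    elems = ["0", "0*"] + [e for g in generators for e in (g, g + "*")]
    le = {(x, x) for x in elems} | {("0", x) for x in elems} | set(relations)
    while True:
        new = {(star(y), star(x)) for x, y in le}
        new |= {(x, w) for x, y in le for z, w in le if y == z}
        if new <= le:
            return elems, le
        le |= new

def coherent_vertices(generators, le):
    """Complete *-selections containing no pair a, b with a <= b*."""
    verts = []
    for flips in product([False, True], repeat=len(generators)):
        v = frozenset({"0*"} | {g + "*" if f else g for g, f in zip(generators, flips)})
        if not any((a, star(b)) in le for a in v for b in v):
            verts.append(v)
    return verts

def cycle_relations(offsets):
    """Relations a_i < a_{i+d}* for every i and every offset d (indices modulo 6)."""
    return [(f"a{i}", f"a{(i + d) % 6}*") for i in range(6) for d in offsets]

gens = [f"a{i}" for i in range(6)]
model_a = set(coherent_vertices(gens, present(gens, cycle_relations([2, 3, 4]))[1]))
model_b = set(coherent_vertices(gens, present(gens, cycle_relations([3]))[1]))

all_off = frozenset({"0*"} | {g + "*" for g in gens})
print(len(model_a), len(model_b), len(model_b - model_a))
```

With the richer set of relations only 13 vertices survive (all sensors off, one sensor on, or two neighbouring sensors on), against 27 for the sparser set. The 14 vertices in the difference play the role of the white vertices of the figure, and the all-off selection belongs to both models.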

Two aspects of this example are noteworthy: 

1) The less nested poc set of the two example poc set 
structures is capable of accommodating both agents, thus 
giving us the means for comparing them. 

2) With none of the agents having direct access to the 
realization maps, they should be looking for efficient 
means of evolving their maps in an adaptive fashion so 
as to produce a good enough symbolic approximation 
of the ground truth. 

E. Cubings and the Duality Theory of Weak Poc Sets 

1) Sageev-Roller Duality from the Categorical Viewpoint: 
In the finite case, the duality theory of poc sets has a very 
clean formulation in category-theoretical terms. For a quick 
review of the basic notions of Category Theory we refer the 
reader to 1^ chapter 4, while here we will stick to the specific 
categories of interest; 

• $\mathrm{Poc}_f$, the category of finite poc sets, where each
$P, Q \in \mathrm{Poc}_f$ have assigned to them the set $\mathrm{Hom}(P, Q)$
of poc morphisms from $P$ to $Q$;

• $\mathrm{Med}_f$, the category of finite median graphs, where
each $G, H \in \mathrm{Med}_f$ are assigned the set $\mathrm{Hom}(G, H)$
of median-preserving maps from the vertex set of $G$
to the vertex set of $H$ (such maps are called median
morphisms).

(Footnote: one could work with the full category Poc of all poc sets (rather than just
the finite ones) but this introduces major complications that seem unnecessary
given the current application. Similarly for the case of median graphs/algebras.)

What connects the two categories is the assignment of the 
graph Dual(P) to every poc set P. The important bit here 
is that this assignment is not confined to the level of objects, 
but, rather, extends over the level of maps as well in a natural 
way; 

Definition A.24. Let $f : P \to Q$ be a morphism of weak
poc sets. The dual map $f^\circ : Q^\circ \to P^\circ$ is defined to be the
pullback map $f^\circ(A) = f^{-1}(A)$. □

It is easy to verify that $f^\circ : Q^\circ \to P^\circ$ is a median-preserving
map, that is:

$$f^\circ(\mathrm{med}(u, v, w)) = \mathrm{med}(f^\circ(u), f^\circ(v), f^\circ(w)) \qquad (73)$$

where the medians are computed in the appropriate duals.
Thus, a map $f \in \mathrm{Hom}(P, Q)$ yields a map
$f^\circ \in \mathrm{Hom}(\mathrm{Dual}(Q), \mathrm{Dual}(P))$. Moreover, one easily checks that
this is done in a way that respects composition, that is:

$$(g \circ f)^\circ = f^\circ \circ g^\circ \qquad (74)$$

whenever the composition of the poc morphisms $f, g$ is
well-defined. This notion of map between categories is called a
functor. The above constructions (of the dual graph and the
dual map), together, are known as the Sageev-Roller duality.

Applying the results A.20 and A.21 we conclude that the
above assignments form a complete duality, or co-equivalence
of categories, between $\mathrm{Poc}_f$ and $\mathrm{Med}_f$. That is, there are:

• A correspondence between $\mathrm{Poc}_f$ and $\mathrm{Med}_f$ at the
level of objects: $P \mapsto \mathrm{Dual}(P)$ is a one-to-one
correspondence between the collection of finite poc sets
and the collection of median graphs;

• A correspondence between $\mathrm{Poc}_f$ and $\mathrm{Med}_f$ at the
level of maps: $f \mapsto f^\circ$ is a composition-reversing
one-to-one correspondence between poc morphisms and
median morphisms.


Thus, Sageev-Roller duality is a dictionary, translating order- 
theoretic statements about finite poc sets into graph-theoretic 
statements about finite median graphs and vice-versa. Loosely 
speaking, the aspects of Boolean Algebra covered by poc 
sets may be conveniently interpreted in terms of the con¬ 
vex geometry of median graphs, reasoned about within this 
framework, and the conclusions may then be translated back 
into the Boolean Algebra setting for the purpose of dealing 
with applications. We will now proceed to survey some of the 
contributions of this category-theoretic point of view to our 
application. 


2) Extending Sageev-Roller Duality to Weak Poc Sets: 
Recalling the fact that the category Poc / of proper finite poc 
sets is too restrictive for the purposes of our current application 
El, the first order of business is to verify that Sageev-Roller 
duality extends to weak poc sets. 
The first observation regarding dual maps is a consequence
of the fact that no coherent subset of a weak poc set may 
contain a negligible element: 

Lemma A.25. Let $P$ be a weak poc set and let $\pi : P \to \hat{P}$
denote the canonical projection. Then $\pi^\circ : \hat{P}^\circ \to P^\circ$
is a median isomorphism, making $\mathrm{Cube}(P)$ and $\mathrm{Cube}(\hat{P})$
isomorphic cubical complexes.

Thus, weak poc sets are indistinguishable from poc sets,
as far as dual graphs are concerned: applying Sageev-Roller
duality one simply obtains

Corollary A.26. For any weak poc set $P$, $\hat{P}$ is naturally
isomorphic to $\mathrm{Poc}(\mathrm{Dual}(P))$. □


At the same time, weak poc sets form a more flexible class 
of objects. In particular, weak poc set structures are easier to 
represent and evolve dynamically using snapshots 


3) Example: Realizations: Suppose $X$ is the state space
(possibly infinite) of some system, and $\Sigma$ is a collection of
binary sensors sensitive to the state of the system, such as
in examples A-D1 and A-D3. Since the sensors are binary
we may assume that each sensor $a \in \Sigma$ comes paired with a
sensor $a^* \in \Sigma$ corresponding to the negation of $a$. In other
words, the sensorium $\Sigma$ comes equipped with a fixpoint-free
involution $*$ and with a realization map $\rho : \Sigma \to \mathfrak{B} \subseteq 2^X$
satisfying $\rho(a^*) = X \setminus \rho(a)$ for all $a \in \Sigma$, where $\mathfrak{B}$ is a
prescribed σ-algebra of measurable events in $X$. It also costs
us nothing to assume there is a special sensor $0 \in \Sigma$ that is
never on, that is: $\rho(0) = \emptyset$.

Suppose now that, having spent some time observing state
transitions in $X$, we are able to write down some implication
relations among the elements of $\Sigma$. These will be recorded
in the form of a partial ordering ($\le$). We would like to use
our a-priori knowledge that $\rho(a^*) = X \setminus \rho(a)$ and, naturally,
we would like to believe that $a \le b$ implies $\rho(a) \subseteq \rho(b)$,
in which case $\rho$ becomes a morphism of the weak poc set
$P = (\Sigma, \le, *)$ into the poc set $\mathfrak{B}$. Assuming this is correct,
what can we say?

For any observed state $x \in X$ the poc set $\mathfrak{B}$ supplies us
with a coherent $*$-selection $\mathfrak{B}_x = \{A \in \mathfrak{B} \mid x \in A\}$. The dual
$\rho^\circ : \mathfrak{B}^\circ \to P^\circ$ then produces a vertex $v_\rho(x) := \rho^\circ(\mathfrak{B}_x)$ in
$\mathrm{Cube}(P)$.


Definition A.27 (Consistent Vertices). Let $\rho$ be a realization
of a poc set $P$. Then the vertices of the form $v_\rho(x)$ as above
are called $\rho$-consistent vertices.

For example, the vertex $v$ in figures 12(a) and (b) is
inconsistent for either realization.

It is possible that $v_\rho : X \to P^\circ$ is not surjective, motivating
the definition:

Definition A.28 (Punctured Dual). Let $\rho$ be a realization
of a poc set $P$. The punctured dual (with respect to $\rho$),
denoted $\mathrm{Cube}_\rho(P)$, or $\mathrm{Cube}(P, \rho)$, is the cubical sub-complex
of $\mathrm{Cube}(P)$ induced by the set of $\rho$-consistent vertices of $P^\circ$
which are contained in $\mathrm{Cube}(P)$.

As a corollary of this discussion we obtain: 


Corollary A.29. $\mathrm{Cube}(P)$ is the smallest cubical sub-complex
of $S(P)$ with the property that, for any realization $\rho$ of $P$, if
$\rho$ is a poc morphism, then $\mathrm{Cube}(P)$ contains all $\rho$-consistent
vertices of $S(P)$.

In other words, any realization that is also a poc morphism
gives rise to a discrete representation of $X$ in $\mathrm{Cube}(P)$ via the
mapping $v_\rho : X \to P^\circ$. The benefit of maintaining a record of
the order in $\Sigma$ is our ability to discard the incoherent vertices
of $S(P)$ (viewed as possible states of the observed system)
without the risk of losing any information about $X$, while
possibly gaining some insight into the organization of $X$, as
stated in the introduction, contribution (i).
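
The relationship between consistent and coherent vertices can be made concrete with a sampled state space. The sketch below is an illustration under stated assumptions: the state space is the interval $[0, 1]$ sampled at 101 integer positions, the sensors are threshold sensors that turn on when $x < k/N$ with $N = 4$, and all names and encodings are chosen here. It computes the vertex observed at each sampled state and compares the set of consistent vertices with the coherent vertices obtained with and without the recorded implications.

```python
from itertools import product

def star(x):
    """Formal complement: a <-> a*, 0 <-> 0*."""
    return x[:-1] if x.endswith("*") else x + "*"

def present(generators, relations):
    """Preorder generated by `relations`, given as pairs (x, y) meaning x <= y."""
    elems = ["0", "0*"] + [e for g in generators for e in (g, g + "*")]
    le = {(x, x) for x in elems} | {("0", x) for x in elems} | set(relations)
    while True:
        new = {(star(y), star(x)) for x, y in le}
        new |= {(x, w) for x, y in le for z, w in le if y == z}
        if new <= le:
            return elems, le
        le |= new

def coherent_vertices(generators, le):
    """Complete *-selections containing no pair a, b with a <= b*."""
    verts = []
    for flips in product([False, True], repeat=len(generators)):
        v = frozenset({"0*"} | {g + "*" if f else g for g, f in zip(generators, flips)})
        if not any((a, star(b)) in le for a in v for b in v):
            verts.append(v)
    return verts

N = 4
gens = [f"a{k}" for k in range(1, N + 1)]
samples = range(0, 101)  # the state x is samples[i] / 100, kept as an integer

def observe(x):
    """Vertex seen at state x/100: sensor a_k is on exactly when x/100 < k/N."""
    return frozenset({"0*"} | {g if x * N < k * 100 else g + "*"
                               for k, g in enumerate(gens, start=1)})

consistent = {observe(x) for x in samples}

with_order = set(coherent_vertices(
    gens, present(gens, [(f"a{k}", f"a{k + 1}") for k in range(1, N)])[1]))
without_order = set(coherent_vertices(gens, present(gens, [])[1]))
print(len(consistent), len(with_order), len(without_order))
```

With no implications on record the observer must allow all 16 vertices of the cube, of which only 5 are ever witnessed. Recording the implications $a_k \le a_{k+1}$ removes exactly the unwitnessed vertices in this example, so no consistent vertex is lost.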

4) Realizations, Cubings and Topology: We recall a definition
from [56]:


Definition A.30. A cubing is a simply connected, non- 
positively curved cubical complex. □ 

We point the reader to 121 for a detailed account of non- 
positively curved metric spaces. For the purpose of this paper 
it will suffice to quote a corollary of the well-known
Cartan-Hadamard theorem ([57], II.4.1):

Corollary A.31. Cubings are contractible. 

We owe the following theorem in its full generality (finite 
and infinite cases) to the collective efforts of Michah Sageev 
1561, Martin Roller OTl and Victor Chepoi ll43l . 

Theorem A.32. The following are equivalent for a finite 
simple graph G: 

1) G is the 1-dimensional skeleton of a cubing; 

2) G is a median graph; 

3) G is isomorphic to Dual(P) for some poc set P; 

4) $G$ is the 1-dimensional skeleton of $\mathrm{Cube}(P)$ for some
poc set $P$.


Further developing the discussion in the preceding paragraph,
we recall one of the main theorems of [30], applied to
that setting:

Theorem A.33. Let $X$ be a topological space and let $P$
be a poc set structure on a finite set $\Sigma$ with realization
$\rho : P \to 2^X$. Let $\mathcal{C}$ denote the collection of cubes in the
cubical complex $\mathrm{Cube}_\rho(P) = \mathrm{Cube}(P, \rho)$. For each $C \in \mathcal{C}$
let $X_C = v_\rho^{-1}(C)$ be the set of all points in $X$ witnessing
$C$. If $\rho$ is a poc morphism, and every $X_C$, $C \in \mathcal{C}$, has an
open neighbourhood $N_C \subseteq X$ such that:

1) each $N_C$ is contractible;

2) $\{X_C\}_{C \in \mathcal{C}}$ and $\{N_C\}_{C \in \mathcal{C}}$ have isomorphic nerves,

then $\mathrm{Cube}_\rho(P)$ is homotopy-equivalent to $X$.


To illustrate the theorem, consider figure 12 again: note how
deleting the vertex $v$ from either discrete model of the circle
results in a space with the correct homotopy type (that of the
circle). On the other hand, going back to example A-D1 and
figure 11, the last theorem explains the qualitative difference
in representations of the interval between the two provided
models: while the ‘thresholds’ model satisfies the requirements
of the theorem, the ‘beacons’ model possesses a vertex - the
one marked $v'_0$ - whose set of witnesses is not connected.

This result could be interpreted as stating a condition on
the richness of the sensorium $(\Sigma, \rho)$: the complex $\mathrm{Cube}(P)$
provides an observer with a discretized contractible model of
the state space $X$ of the observed system, while $\mathrm{Cube}_\rho(P)$
is a more realistic model of $X$ taking the standard form of a
contractible space minus a set of obstacles.

5) Example: A ‘bad’ Poc Morphism: It is not true in general
that the dual of a poc morphism $f : P \to Q$ extends to a
morphism of graphs. For example, consider the situation

$$P = \langle a, b, c \mid a < b,\ b < c \rangle, \qquad Q = \langle x, y \mid x < y \rangle \qquad (75)$$

where $f : P \to Q$ is defined by $f(a) = f(b) = x$ and $f(c) = y$.
The duals and dual map are illustrated in figure 13.

Fig. 13. The dual of a poc morphism is not necessarily a graph morphism
(details in A-E5).

The absence of a canonical choice of extension for $f^\circ$ to a
graph morphism of $\mathrm{Dual}(Q)$ into $\mathrm{Dual}(P)$ hints at a solution
directly involving cubings: if one were to extend the range of
$f^\circ$ to include the 2-dimensional cube shown in the figure, it
would have been possible to extend $f^\circ$ to a cellular map taking
the edge of $\mathrm{Dual}(Q)$ crossed by $x$ to an appropriately chosen
diagonal of that cube. More generally, it is possible to extend
$f^\circ$ to a continuous embedding of $\mathrm{Cube}(Q)$ into $\mathrm{Cube}(P)$
for any poc morphism $f : P \to Q$ by applying convexity
properties of the canonical piecewise-Euclidean metrics on
$\mathrm{Cube}(P)$ and $\mathrm{Cube}(Q)$ ([57], II.2.7). Thus, although median
graphs are sufficient for describing the dual graphs of poc sets,
describing the dual morphisms requires the higher dimensional
geometry of cubings.

6) Example: Degeneration: Recall our promise to maintain 
weak poc set structures on a sensorium S dynamically, updat¬ 
ing the ordering on E in real time. The duality theory of poc 
sets provides us with a hint as to how such maintenance should 
be done. The learning methods of section are motivated by 
the an analogy between the following observations and the 
ideas underlying Hebbian learning; 

The kind of update we expect to see in an instance of 
learning is captured in the following simple example. 

Pi = (a, 5, c |a < c, & < c) , 

P 2 = {a,h,c\a <h < c) 

The two poc set structures have the same underlying set
(denote it by $P$) and the identity map $f = \mathrm{id}_P : P_1 \to
P_2$ is a morphism, while the inverse map $g = \mathrm{id}_P : P_2 \to
P_1$ is not. Thinking of $P_1$ as representing an observer yet
undecided regarding the nature of nesting (if any) of the pair
$\{a, b\}$ and therefore maintaining $a \pitchfork b$ in $P_1$, we see poc set



Fig. 14. The dual of a degeneration is an embedding of median graphs (details
in A-E6).


$P_2$ as representing an observer with an identical set of beliefs
except for the additional relation $a < b$. Figure 14 visualizes
the dual map $f^\circ$.

In general, if $P_1$ and $P_2$ are poc sets with the same
underlying set $P$ and $f = \mathrm{id}_P : P_1 \to P_2$ is a poc morphism,
then the dual f° has the following properties (see e.g. lISTl '): 


Proposition A.34. Suppose $f : P_1 \to P_2$ is a bijective poc
morphism. Then:

1) f° is injective ( 4571/ . proposition 7.8); 

2) $f^\circ$ extends to an injective cellular embedding of
$\mathrm{Cube}(P_2)$ in $\mathrm{Cube}(P_1)$;

3) The image of $\mathrm{Cube}(P_2)$ under this embedding is a strong
deformation retract of $\mathrm{Cube}(P_1)$.


A more complicated instance of this setting is very nicely
visualized in figure 12.
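
For the degeneration of $P_1$ to $P_2$ above, the injectivity of the dual map can be seen directly: since the underlying sets agree, the dual of the identity is simply the inclusion of vertex sets. The Python sketch below is a hedged illustration with hypothetical helper names; it treats vertices as coherent complete *-selections and omits the trivial pair 0, 0*.

```python
from itertools import product

def star(s):
    """Complement of a literal: 'a' <-> 'a*'."""
    return s[:-1] if s.endswith("*") else s + "*"

def strict_order(relations):
    """Transitive closure of the given relations plus b* < a* for every a < b."""
    lt = set(relations) | {(star(b), star(a)) for a, b in relations}
    while True:
        new = {(a, d) for a, b in lt for c, d in lt if b == c} - lt
        if not new:
            return lt
        lt |= new

def dual_vertices(gens, relations):
    """Coherent complete *-selections of the poc set presented by `relations`."""
    lt = strict_order(relations)
    verts = set()
    for choice in product(*[(g, star(g)) for g in gens]):
        u = frozenset(choice)
        if not any((s, star(t)) in lt for s in u for t in u):
            verts.add(u)
    return verts

dual_P1 = dual_vertices("abc", [("a", "c"), ("b", "c")])   # a, b transverse
dual_P2 = dual_vertices("abc", [("a", "b"), ("b", "c")])   # extra relation a < b

# The dual of the identity P1 -> P2 sends a vertex of Dual(P2) to itself.
lost = dual_P1 - dual_P2
print(sorted(sorted(v) for v in lost))
```

In this run the only vertex of Dual($P_1$) missing from Dual($P_2$) is the one containing a and b*, which is exactly the selection ruled out by the new relation a < b; the square on a and b collapses onto a path.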


Appendix B 

Proofs of Technical Results 
A. Proof of proposition II.7

We first extend the weight function $ab \mapsto w_{ab}$ to a
symmetric function on $\Sigma \times \Sigma$ by setting

$w_{aa} = w_{ab} + w_{ab^*}, \quad w_{aa^*} = 0$ (76)

for any $a \in \Sigma$ and for any $b \in \Sigma$ with $ab \in K_\Sigma$. The
consistency constraint ( fO] ) implies that this extension is
well-defined. This allows us to extend the definition of $\omega(\cdot)$ in ( fTb] )
to the whole of $\Sigma \times \Sigma$ while satisfying the following identities

$\omega(ab) + \omega(ba) = 0, \quad \omega(aa) = 0$ (77)

for all $a, b \in \Sigma$. Note that $\omega(\cdot)$, by definition, takes directed
edges and loops for its input.

Further, the consistency identity allows us to write, for any
$ab \in K_\Sigma$:

$\omega(aa^*) = w_{a^*a^*} - w_{aa}$ (78)

$= w_{a^*b} + w_{a^*b^*} - w_{ab} - w_{ab^*}$ (79)

$= \omega(ab) + \omega(ab^*)$ (80)

We next observe that the cocycle constraint may be
rewritten in the form

$\omega(ab) + \omega(bc) = \omega(ac)$ (81)

Let us verify that this identity is, in fact, an identity over all
$a, b, c \in \Sigma$. Due to the symmetries of $\omega(\cdot)$ in (77), it suffices
to verify it only in the following cases:

1) The pair $ab \in K_\Sigma$ and $c = a$: this case is taken care of
by the anti-symmetry identity of $\omega(\cdot)$.

2) The pair $ab \in K_\Sigma$ and $c = a^*$: this is precisely the
statement of (80).

3) The pair $ac \notin K_\Sigma$ and $b, c \in \{a, a^*\}$: without loss of
generality, either $b = c = a$ and (81) ends up claiming
that $0 + 0 = 0$, or $b = a$ and $c = a^*$, in which case
the statement turns into the trivial identity $\omega(aa^*) =
\omega(aa^*)$.

(Cases 1-2 correspond to exactly two of the pairs being proper;
case 3 accounts for all situations when none of the pairs is
proper; having exactly one of the pairs proper is impossible.)

Now suppose that $p = (a_0, \ldots, a_m)$ is any directed vertex
path in the given poc graph $\Gamma$. Then, applying the identity
(81) repeatedly we obtain:

$\omega(a_0 a_m) = \omega(a_0 a_1) + \cdots + \omega(a_{m-1} a_m)$ (82)

By the assumption on $\Gamma$, all the summands on the right hand
side are positive. In particular, if $p$ were a cycle with $a_m = a_0$
we would have obtained

$0 = \omega(a_0 a_0) = \omega(a_0 a_m) > 0$ (83)

which is a contradiction. We must therefore conclude that directed
cycles in $\Gamma$ are impossible, as desired.


B. Proof of lemma II.15


An evolution of the trivial snapshot is an empirical snapshot,
for if $S$ can be written in the form $S = O_t * \cdots * O_1 *
\mathrm{Null}$ then, defining indicators

$c^k_{ab} = (O_k : a) \cdot (O_k : b)$ (84)


- compare with o - one would have the following identity 
holding for S: 

$w_{ab} = \sum_{k=1}^{t} c^k_{ab}$ (85)


Since the functions $c^k$ satisfy the consistency constraint, so
does their sum $w$. The clock requirement is satisfied, too,
since for any proper pair $\{a, b\}$ one has:

$w_a + w_{a^*} = \sum_{k=1}^{t} \left( c^k_{ab} + c^k_{ab^*} + c^k_{a^*b^*} + c^k_{a^*b} \right) = t$ (86)

- compare with equation ©■ 
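
The counting argument behind (85) and (86) can be replayed on a toy record of observations. The Python sketch below is a hedged illustration, not the authors' implementation: it reads the indicator in (84) as "both a and b belong to the k-th observation" (an assumption), and the sensor names and the observation list are made up.

```python
def star(s):
    """Complement of a literal: 'a' <-> 'a*'."""
    return s[:-1] if s.endswith("*") else s + "*"

# Hypothetical sensors and a made-up record of complete *-selections O_1..O_t.
sensors = ["a", "b", "c"]
literals = [l for s in sensors for l in (s, star(s))]
observations = [
    {"a", "b*", "c"},
    {"a*", "b*", "c"},
    {"a", "b", "c"},
    {"a*", "b*", "c*"},
]
t = len(observations)

# Edge counters as in (85): w[a, b] counts observations containing both a and b.
w = {(a, b): sum(a in o and b in o for o in observations)
     for a in literals for b in literals}

def marginal(a, b):
    """w_ab + w_ab*, which consistency requires to be independent of b."""
    return w[a, b] + w[a, star(b)]

# Consistency: the marginal equals w_aa for every b, matching the extension (76).
consistent = all(marginal(a, b) == w[a, a] for a in literals for b in literals)

# Clock requirement (86): w_a + w_a* = t for every literal a.
clock = all(w[a, a] + w[star(a), star(a)] == t for a in literals)
print(consistent, clock)
```

Because every observation is a complete *-selection, each one contributes exactly one unit to exactly one corner of every square, which is why both checks succeed on any such record.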

Conversely, due to the presence of an integer clock, it 
suffices to show that any empirical snapshot S can be written 
in the form S = O * T where O = and T is either trivial 
or empirical. Let T be the weighted graph obtained from S 
by performing the following operations: 

1) Subtract one unit from $w_{ab}$ for every proper pair $a, b \in \Sigma$
satisfying $\{a, b\} \subseteq O$.

2) For any $a \in \Sigma$, set for $T$ to 1 iff $w_a > w_{a^*}$ in $T$.

The set $O$ is a complete $*$-selection by construction, so it
remains to verify the consistency and synchrony conditions
for the new snapshot. For every proper pair $a, b \in \Sigma$, the fact
that $O$ is a $*$-selection implies that all but one of the edge
counters in $T|_{ab}$ coincide with their $S$ counterparts, while the
exceptional one (let it be $w_{ab}$ without loss of generality) is
smaller than its counterpart in $S$ by one unit. Since the sum of
edge counters in $S|_{ab}$ is independent of the choice of square,
we conclude the same is true for $T|_{ab}$. To prove consistency
we observe that $w_{ab} + w_{ab^*}$ suffering a decrease (of one unit)
in the passage to $T$ implies $a \in O$ and hence $w_{ac} + w_{ac^*}$
must suffer a decrease as well for any $c \notin \{a, a^*\}$, since $O$
is a complete $*$-selection. Finally, with $w_a$ being well-defined
in $T$ for all $a \in \Sigma$, the choice of guarantees that =fa = 1
is only possible in $T$ if $w_a > 0$ in $T$.

C. Equivalences in probabilistic Snapshots 

Throughout this section, $S$ is a probabilistic snapshot satisfying
the triangle inequality ( [3^ . The reason for the name is
that the symmetric dissimilarity measure on $\Sigma \times \Sigma$ defined by

$\mu(ab) := w_{a^*b} + w_{ab^*} \geq 0$ (87)

allows rewriting ( [3^ in the form

$\mu(ac) \leq \mu(ab) + \mu(bc)$ (88)

for all $a, b, c \in \Sigma$.

Now let us turn to the purpose of this discussion, the
analysis of equivalences in $\Sigma$ that are observed in $S$. Let
$\mathrm{Eq}(S)$ denote the undirected graph with vertex set $\Sigma$ having
edges $ab$ and $a^*b^*$ for every $ab \in K_\Sigma$ satisfying ( [3T] l. Let $\bar\Sigma$
denote the partition of $\Sigma$ induced by the connected components
of $\mathrm{Eq}(S)$, and let $\mathrm{eq} : \Sigma \to \bar\Sigma$ denote the map sending $a \in \Sigma$
to its block in $\bar\Sigma$. In other words, $\mathrm{eq}(a)$ is nothing more than
the list of all sensations $b \in \Sigma$ deemed equivalent to $a$, either
directly or through a finite chain of equivalences.

Returning to the notation of appendix B-A and keeping in
mind ( |7^ we observe that:

$a \equiv b \;\Rightarrow\; \omega(ab) = w_{a^*b} - w_{ab^*} = 0$ (89)

and since $\omega(\cdot)$ is additive (81) we conclude:

$b \in \mathrm{eq}(a) \;\Rightarrow\; \omega(ab) = 0$ (90)

In particular, since $\omega(ab) > 0$ whenever $ab \in \mathrm{Dir}(S)$, we
conclude:

Lemma B.1. If $a, b \in \Sigma$ are connected by a directed path in
$\mathrm{Dir}(S)$ then $\mathrm{eq}(a) \cap \mathrm{eq}(b) = \emptyset$.

Similarly, for the metric $\mu(\cdot)$ on $\Sigma$, we have that $\mu(ab) = 0$
if $a \equiv b$ by definition. The triangle inequality allows us to
conclude:

Lemma B.2. If $b \in \mathrm{eq}(a)$ then $\mu(ab) = 0$.

Proof: We have $b \in \mathrm{eq}(a)$ iff there is a finite sequence
of elements in $\Sigma$ of the form

$a = a_0 \equiv a_1 \equiv \cdots \equiv a_n = b, \quad n \geq 0$ (91)

Applying the triangle inequality $n - 1$ times we obtain

$\mu(ab) \leq \sum_{i=1}^{n} \mu(a_{i-1} a_i) = 0$, (92)

as desired. ■
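
The blocks of the partition can be computed with a union-find pass over the pairs at dissimilarity zero. The Python sketch below is a hedged illustration under two stated assumptions: the snapshot is obtained from relative frequencies of a made-up observation record, and two sensations are declared directly equivalent exactly when $\mu(ab) = 0$. All names are hypothetical.

```python
from fractions import Fraction

def star(s):
    """Complement of a literal: 'a' <-> 'a*'."""
    return s[:-1] if s.endswith("*") else s + "*"

sensors = ["a", "b", "c"]
literals = [l for s in sensors for l in (s, star(s))]
# Made-up record in which sensor b always agrees with sensor a.
observations = [{"a", "b", "c"}, {"a", "b", "c*"}, {"a*", "b*", "c"},
                {"a*", "b*", "c*"}, {"a", "b", "c"}]
t = len(observations)
w = {(x, y): Fraction(sum(x in o and y in o for o in observations), t)
     for x in literals for y in literals}

def mu(x, y):
    """Dissimilarity (87): weight of the two 'disagreeing' corners."""
    return w[star(x), y] + w[x, star(y)]

def classes():
    """Blocks of the partition generated by the pairs with mu = 0 (assumed rule)."""
    parent = {x: x for x in literals}
    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x
    for x in literals:
        for y in literals:
            if mu(x, y) == 0:
                parent[find(x)] = find(y)
    blocks = {}
    for x in literals:
        blocks.setdefault(find(x), set()).add(x)
    return {x: frozenset(blocks[find(x)]) for x in literals}

eq = classes()
triangle = all(mu(x, z) <= mu(x, y) + mu(y, z)
               for x in literals for y in literals for z in literals)
print(triangle, sorted(sorted(block) for block in set(eq.values())))
```

On this record a and b fall into one block, a* and b* into another, and c and c* remain singletons, so no block contains a sensation together with its complement.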
In order to see that $\mathrm{Dir}(S)$ actually defines a weak poc set
structure on $\Sigma$ we need the following:

Lemma B.3. Given $S$ as above, for all $a \in \Sigma$ we have:

1) $\mathrm{eq}(a^*) = \mathrm{eq}(a)^*$;

2) $\mathrm{eq}(a^*) \neq \mathrm{eq}(a)$.

(Recall the convention $A^* = \{x^* \mid x \in A\}$ for $A \subseteq \Sigma$.)

Proof: For (1) it suffices to observe that $a \equiv b$ implies
$a^* \equiv b^*$ by construction. Assertion (2) follows from observing
that

$\mathrm{eq}(a^*) = \mathrm{eq}(a) \iff \mathrm{eq}(a^*) \cap \mathrm{eq}(a) \neq \emptyset \iff a^* \in \mathrm{eq}(a)$ (93)

However, by the preceding lemma, $a^* \in \mathrm{eq}(a)$ would imply
$\mu(aa^*) = 0$ which, in turn, would contradict the obvious
equality

$\mu(aa^*) = w_{a^*a^*} + w_{aa} = 1$ (94)

finishing the proof. ■

The following proposition summarizes our progress thus far: 

Proposition B.4. Let $S$ be a probabilistic snapshot satisfying
the triangle inequality. Then:

1) The operation $\mathrm{eq}(a) \mapsto \mathrm{eq}(a^*)$ defines a fixpoint-free
involution on $\bar\Sigma$.

2) The directed graph $\Gamma$ with vertex set $\bar\Sigma$ and with an edge
pointing from $\mathrm{eq}(a)$ to $\mathrm{eq}(b)$ iff there exist $a' \in \mathrm{eq}(a)$
and $b' \in \mathrm{eq}(b)$ such that $a'b'$ is an edge of $\mathrm{Dir}(S)$ is
an acyclic poc graph.

3) Let $P = \mathrm{Poc}(\mathrm{Dir}(S))$ and $\bar P = \mathrm{Poc}(\Gamma)$. Then the
quotient map $\mathrm{eq} : \Sigma \to \bar\Sigma$ is a poc morphism of $P$ onto
$\bar P$ with the property that every fiber of $\mathrm{eq}$ is a transverse
subset of $P$.

4) For any subset $A \subseteq \Sigma$ one has

$A{\uparrow} = \mathrm{eq}^{-1}(\mathrm{eq}(A){\uparrow})$ (95)

In particular, propagation over $\bar P$ is equivalent to
propagation over Dir(S)o (see defn. II.20).

□


D. Proof of proposition III.1


This proof requires the results of section A-E The inclu 


incact ■ Act|^ ^ : inCobs ■ C)bs|j ^ P|^ (96) 

satisfy projact = and projobs = inc-obs’ ^ = 

Act U Obs and Act n Obs = {0, 0*}, we conclude that the 
identity map idp of P is, in fact, a surjective poc morphism 


idp : Act I V Obs I —P| 


(97) 


(see A-D2 for a definition of Act|^ V Obs|^). By proposition
A.34 the dual of this map is a median-preserving embedding
of cubical complexes:


idp : Cube(P| J ^ Cube(Act| J x Cube(Obs| J , (98) 

as required.


E. Local Structure of Duals and Greedy Navigation 

In we suggested exploring the link between the con¬ 
vexity theory of duals of weak poc sets and planning in DBAs, 
yet the formal results contained therein proved insufficient 
for supporting the planning algorithms proposed in this paper. 
This section fills in this gap. 

Throughout this section we fix a finite weak poc set $P$
and the median graph $\Gamma = \mathrm{Dual}(P)$ (which is to say, $\Gamma$ is
an arbitrary finite median graph). We study the problem of
computing the image of a non-empty convex subset $\mathfrak{h}(S)$ of
$\Gamma$ under the closest point projection of $\Gamma$ to the convex subset
$\mathfrak{h}(T)$.

1) Gates: We recall the following definitions and results 
from 11311 : 

Definition B.5 (Separator). Let $K, L \subseteq P^\circ$ be sets. The set

$\mathrm{sep}(K, L) = \{a \in P \mid K \subseteq \mathfrak{h}(a),\ L \subseteq \mathfrak{h}(a^*)\}$ (99)

is called the separator of $K$ and $L$. □

The inequality $\Delta(u, v) \geq |\mathrm{sep}(K, L)|$ follows immediately
for all $u \in K$ and $v \in L$. This motivates:

Definition B.6 (Gate). Let $K, L \subseteq P^\circ$. A gate for $K, L$ is a
pair of points $u \in K$, $v \in L$ such that $\Delta(u, v) = |\mathrm{sep}(K, L)|$.
□


The following result is well known in our setting: 

Proposition B.7. Let $K, L$ be non-empty convex subsets of $\Gamma$
and let $u \in K$ and $v \in L$. Then $u, v$ form a gate for $K, L$
if and only if $\mathrm{proj}_K v = u$ and $\mathrm{proj}_L u = v$. Moreover, any
pair of non-empty convex subsets of $\Gamma$ has a gate.

We will apply this proposition without proof. An important 
consequence for us is the following: 

Lemma B.8. Suppose $K = \mathfrak{h}(S)$ and $S \subseteq P$ is coherent.
Then, for any $a \in P$, if $K \subseteq \mathfrak{h}(a)$ then there exists $s \in S$
such that $s \leq a$.

Proof: Let $u \in K$ and $v \in L := \mathfrak{h}(a^*)$ form a gate. Since
$v \notin K$, there exists $s \in S$ such that $v \in \mathfrak{h}(s^*)$.

Suppose there were a $w \in L$ with $w \in \mathfrak{h}(s)$, and consider
$m = \mathrm{med}(u, v, w)$. Then $a^* \in v, w$ implies $a^* \in m$, but the
inequality

$\Delta(u, v) = \Delta(u, m) + \Delta(m, v) \geq \Delta(u, m)$ (100)

implies $m = v$, since $v = \mathrm{proj}_L u$. On the other hand, $s \in
u, w$ implies $s \in m$, which is a contradiction.

Thus, we have shown that $L = \mathfrak{h}(a^*)$ is contained in $\mathfrak{h}(s^*)$.
Equivalently, $a^* \leq s^*$, which is the same as $s \leq a$. ■

The same kind of reasoning yields: 

Lemma B.9. Suppose $K, L$ are non-empty convex subsets of
$\mathrm{Dual}(P)$. If $K \cap L \neq \emptyset$, then $\mathrm{proj}_K L = \mathrm{proj}_L K = K \cap L$.

Proof: Clearly, if $v \in K \cap L$ then $\mathrm{proj}_L(v) = v$, so
$K \cap L \subseteq \mathrm{proj}_L K$. For the reverse inclusion, suppose $v \in
\mathrm{proj}_L K$ and write $v = \mathrm{proj}_L u$, $u \in K$. Pick any point
$w \in K \cap L$. Setting $m = \mathrm{med}(w, v, u)$ we note that $m \in L$
(because $w, v \in L$) and

$\Delta(u, v) = \Delta(u, m) + \Delta(m, v) \geq \Delta(u, m)$.

The uniqueness of projection forces $v = \mathrm{proj}_L u$ to coincide
with $m$. However, since $w, u \in K$ we also have $m \in K$,
showing $v \in K \cap L$. ■

2) The Coherent Projection: We need to study a technical
notion motivated by the need to correct the observation
of the current state, as explained in section II-B. We recall the
following standard notation for partially ordered sets:


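$a{\uparrow} = \{p \in P \mid a \leq p\}, \quad a{\downarrow} = \{q \in P \mid q \leq a\}$, (101)

and

$A{\uparrow} = \bigcup_{a \in A} a{\uparrow}, \quad A{\downarrow} = \bigcup_{a \in A} a{\downarrow}$. (102)

Note that in a poc set $P$, one has the identities

$A^*{\uparrow} = A{\downarrow}^*, \quad A^*{\downarrow} = A{\uparrow}^*$. (103)

For an arbitrary subset $A$ of $P$ we can define the following
‘correction’ of $A$: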


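Proposition B.10 (Coherent Projection). Let $P$ be a finite poc
set and $A \subseteq P$ be any subset. Then the set

$\mathrm{coh}(A) := A{\uparrow} \setminus A^*{\downarrow} = A{\uparrow} \setminus A{\uparrow}^*$ (104)

is coherent, and satisfies the following properties:

1) if $A$ is coherent then $\mathrm{coh}(A) = A{\uparrow}$;

2) $\mathrm{coh}(A){\uparrow} = \mathrm{coh}(A)$;

3) $\mathrm{coh}(\mathrm{coh}(A)) = \mathrm{coh}(A)$;

4) if $A \in P^\circ$ then $A = \mathrm{coh}(A) = A{\uparrow}$.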

Proof: To show the coherence of $\mathrm{coh}(A)$, let $b, c \in
\mathrm{coh}(A)$ with $b \leq c^*$; since $b \in A{\uparrow}$ we have $c^* \in A{\uparrow}$ and $c \in A{\uparrow}^*$,
contradicting $c \in \mathrm{coh}(A)$.

For (1), $A$ is coherent iff $A{\uparrow}$ and $A^*{\downarrow}$ are disjoint.
Therefore, if $A$ is coherent then $\mathrm{coh}(A) = A{\uparrow} \setminus A^*{\downarrow} = A{\uparrow}$.

For (2), let $a \in \mathrm{coh}(A)$ and $a \leq b$. Then $a \in A{\uparrow}$ implies
$b \in A{\uparrow}$, and it suffices to verify $b \notin A{\uparrow}^*$. Indeed, were
there $c \in A$ with $b^* \geq c$ then $a \leq b \leq c^*$ would have given
$a \in A^*{\downarrow}$ in contradiction of $a \in \mathrm{coh}(A)$. Thus, $b \in \mathrm{coh}(A)$,
as required.

For (3), since $\mathrm{coh}(A)$ is coherent we have $\mathrm{coh}(\mathrm{coh}(A)) =
\mathrm{coh}(A){\uparrow}$ by substituting $\mathrm{coh}(A)$ instead of $A$ in (1), and then
we apply (2).

Finally, for (4), $A \in P^\circ$ means $A$ is a coherent complete
$*$-selection, so $\mathrm{coh}(A) = A{\uparrow}$ by (1) and it remains to show
$A{\uparrow} = A$. Were there $b \in A{\uparrow}$ with $b \notin A$ we would have had
$b^* \in A$, since $A$ is a complete $*$-selection. But then we would
also have had $b, b^* \in \mathrm{coh}(A)$, contradicting the coherence of
$\mathrm{coh}(A)$. ■
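
The coherent projection is easy to experiment with. The Python sketch below is a hedged illustration on the small poc set generated by a < c and b < c (chosen here only as an example, with the trivial pair 0, 0* omitted); it computes coh(A) as the upward closure of A minus its complement set, and checks the properties listed in proposition B.10 on a few inputs. All function names are hypothetical.

```python
def star(s):
    """Complement of a literal: 'a' <-> 'a*'."""
    return s[:-1] if s.endswith("*") else s + "*"

def strict_order(relations):
    """Transitive closure of the given relations plus b* < a* for every a < b."""
    lt = set(relations) | {(star(b), star(a)) for a, b in relations}
    while True:
        new = {(a, d) for a, b in lt for c, d in lt if b == c} - lt
        if not new:
            return lt
        lt |= new

def up(A, lt):
    """Upward closure of A (lt is assumed transitively closed)."""
    return set(A) | {q for p, q in lt if p in A}

def is_coherent(A, lt):
    """No s, t in A with s <= t*."""
    return not any(s == star(t) or (s, star(t)) in lt for s in A for t in A)

def coh(A, lt):
    """Coherent projection (104): upward closure minus its own complements."""
    closure = up(A, lt)
    return closure - {star(x) for x in closure}

lt = strict_order([("a", "c"), ("b", "c")])

incoherent = {"a", "c*"}          # a < c = (c*)*, so this set is incoherent
corrected = coh(incoherent, lt)   # only the uncontested literal survives
vertex = {"a*", "b", "c"}         # a coherent complete *-selection
print(sorted(corrected), sorted(coh({"a"}, lt)), sorted(coh(vertex, lt)))
```

In this example the conflicting literals a, c and c* (together with their consequences) cancel out, and the correction keeps only b*, which is forced by c* and contradicted by nothing.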

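3) Computing the Projection Maps: For a vertex $u \in P^\circ$
and any subset $A \subseteq u$, one defines:

$[u]_A := (u \setminus A) \cup A^*$ (105)

Clearly, $[u]_A$ is a $*$-selection. It is easily verified that $[u]_A$
is coherent if and only if there exists no pair $a \in A$ and
$b \in u \setminus A$ satisfying $b < a$. This observation was first made
in [15], leading to the following results in our setting: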

Lemma B.11. Let $P$ be a finite weak poc set and let $u \in P^\circ$
be any vertex. Then the set $N(u)$ of vertices adjacent to $u$
in $\Gamma = \mathrm{Dual}(P)$ coincides with the set of all $[u]_a$, $a$ ranging
over the minset of $u$:

$\min(u) := \{a \in u \mid b < a \Rightarrow b \notin u\}$ (106)

More generally, the cubes in $\mathrm{Cube}(P)$ are characterized as
follows:

Lemma B.12. Let $P$ be a finite weak poc set and $u \in P^\circ$
be a vertex. Then the cubes of $\mathrm{Cube}(P)$ incident to $u$ are
in one-to-one correspondence with the transverse subsets of
$\min(u)$.

A particular application of these observations is an explicit
construction of a geodesic path in $\Gamma$ emanating from a given
vertex $u$ and terminating at its unique closest point projection
$\mathrm{proj}_{\mathfrak{h}(T)} u$:

Proposition B.13. Let $P$ be a finite weak poc set and suppose
$u \in P^\circ$ is a vertex. Let $T$ be a coherent subset of $P$. Then
the following algorithm constructs a shortest path in $\Gamma$ from
$u$ to $K = \mathfrak{h}(T)$:

1) Find an element $b \in T \setminus u$; if no such element exists, stop and
output $u$.

2) Find an element $c \leq b^*$ with $c \in \min(u)$;

3) Replace $u$ by $[u]_c$ and go to the first step.

Proof: We have $u \in K$ iff $T \subseteq u$, which provides the
stopping condition for the algorithm. Now, if $u \notin K$ and $b \in
T \setminus u$ then for all $v \in K$ one has $v \in \mathfrak{h}(b)$ and $u \in \mathfrak{h}(b^*)$.
Since $c \leq b^*$, we have $u \in \mathfrak{h}(c) \subseteq \mathfrak{h}(b^*)$, implying $v \in \mathfrak{h}(c^*)$
and $c \in u \setminus v$. As a result:

$\Delta(v, [u]_c) = \Delta(v, u) - 1$ (107)

Having reduced $\Delta(u, v)$ by a unit for all $v \in K$, we have
reduced $\Delta(u, K)$ by a unit as well. ■

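Corollary B.14 (Projection of a Point). Let $P$ and $T$ be as
above. Then the closest point projection to $K = \mathfrak{h}(T)$ is given
by the formula:

$\mathrm{proj}_K u = (u \setminus T^*{\downarrow}) \cup T{\uparrow} = (u \cup T{\uparrow}) \setminus T^*{\downarrow}$ (108)

Proof: The second equality follows from the DeMorgan
rules and the fact that $T{\uparrow} \cap T^*{\downarrow} = \emptyset$ (since $T$ is coherent).

Set $K = \mathfrak{h}(T)$ and proceed by induction on $\Delta(u, K)$. If
$\Delta(u, K) = 0$, then $u \in K$ and therefore $T \subseteq u$. In addition,
$u$ is coherent and we conclude $T^*{\downarrow} \cap u = \emptyset$, leaving us with

$(u \setminus T^*{\downarrow}) \cup T{\uparrow} = u \cup T{\uparrow} = u$,

as desired. Now suppose $n := \Delta(u, K) > 0$. By the preceding
proposition, there is $a \in T^*{\downarrow} \cap u$ such that $v := [u]_a \in P^\circ$,
$\Delta(v, K) = n - 1$, and $\mathrm{proj}_K u = \mathrm{proj}_K v$. We thus have:

$\mathrm{proj}_K u = \mathrm{proj}_K v = (v \setminus T^*{\downarrow}) \cup T{\uparrow} = (u \setminus T^*{\downarrow}) \cup T{\uparrow}$,

the last equality being due to $a \in T^*{\downarrow}$ and $a^* \in T{\uparrow}$. Thus, the
first identity has been proved. ■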
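
The greedy procedure of proposition B.13 and the closed formula (108) can be compared on a small instance. The following Python sketch is a hedged illustration, not the authors' implementation: it uses the poc set generated by a < c and b < c as a made-up example (trivial pair omitted), breaks ties alphabetically (an arbitrary choice), and all function names are hypothetical.

```python
def star(s):
    """Complement of a literal: 'a' <-> 'a*'."""
    return s[:-1] if s.endswith("*") else s + "*"

def strict_order(relations):
    """Transitive closure of the given relations plus b* < a* for every a < b."""
    lt = set(relations) | {(star(b), star(a)) for a, b in relations}
    while True:
        new = {(a, d) for a, b in lt for c, d in lt if b == c} - lt
        if not new:
            return lt
        lt |= new

def up(A, lt):
    """Upward closure of A (lt is assumed transitively closed)."""
    return set(A) | {q for p, q in lt if p in A}

def minset(u, lt):
    """min(u) as in (106): elements of u with nothing smaller inside u."""
    return {a for a in u if not any((b, a) in lt for b in u)}

def greedy_projection(u, T, lt):
    """Steps 1-3 of proposition B.13; returns the list of visited vertices."""
    u = set(u)
    path = [frozenset(u)]
    while True:
        missing = sorted(set(T) - u)            # step 1: some b in T \ u
        if not missing:
            return path
        b = missing[0]
        c = next(c for c in sorted(minset(u, lt))   # step 2: c <= b*, c in min(u)
                 if c == star(b) or (c, star(b)) in lt)
        u = (u - {c}) | {star(c)}               # step 3: flip c
        path.append(frozenset(u))

def closed_form(u, T, lt):
    """Formula (108): remove the complements of the closure of T, then add it."""
    T_up = up(T, lt)
    return frozenset((set(u) - {star(x) for x in T_up}) | T_up)

lt = strict_order([("a", "c"), ("b", "c")])
u0 = {"a*", "b*", "c*"}
T = {"a"}
path = greedy_projection(u0, T, lt)
print([sorted(v) for v in path], sorted(closed_form(u0, T, lt)))
```

On this instance the walk flips c* first (the only minimal element below a*) and then a*, and its endpoint agrees with the closed formula.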
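4) Projecting a Convex Set to a Convex Set:

Proposition B.15. Let $K, L$ be non-empty convex subsets with
$L = \mathfrak{h}(S)$ and $K = \mathfrak{h}(T)$. Then

$\mathrm{proj}_K L = \mathfrak{h}((S{\uparrow} \cup T{\uparrow}) \setminus T{\uparrow}^*) = K \cap \mathfrak{h}(S{\uparrow} \setminus T{\uparrow}^*)$ (109)

Proof: Since $T$ is coherent, $T{\uparrow}$ and $T^*{\downarrow} = T{\uparrow}^*$ are
disjoint. This allows us to write:

$\mathfrak{h}((S{\uparrow} \cup T{\uparrow}) \setminus T{\uparrow}^*) = \mathfrak{h}(T{\uparrow} \cup (S{\uparrow} \setminus T{\uparrow}^*)) = \mathfrak{h}(T{\uparrow}) \cap \mathfrak{h}(S{\uparrow} \setminus T{\uparrow}^*)$

and the second equality in (109) follows from the identity
$\mathfrak{h}(T) = \mathfrak{h}(T{\uparrow})$. Denote $R = S{\uparrow} \setminus T{\uparrow}^*$ and $N = \mathfrak{h}(R)$.

For every $u \in L = \mathfrak{h}(S)$ we have $S{\uparrow} \subseteq u$, implying $\mathrm{proj}_K u$
contains $T{\uparrow} \cup R$, by corollary B.14. Thus, $\mathrm{proj}_K L \subseteq K \cap N$,
as required.

For the converse, observe that the case $K \cap L \neq \emptyset$ was
already dealt with in lemma B.9: if $K \cap L \neq \emptyset$, then

$\mathrm{proj}_K L = K \cap L = \mathfrak{h}(S{\uparrow}) \cap \mathfrak{h}(T{\uparrow}) = \mathfrak{h}(S{\uparrow} \cup T{\uparrow})$

In particular, $S{\uparrow} \cup T{\uparrow}$ is coherent, and hence does not intersect
$T{\uparrow}^*$, and the formula (109) holds.

Thus we may henceforth assume $K \cap L = \emptyset$. Equivalently,
$S{\uparrow} \cap T^*{\downarrow} \neq \emptyset$. In fact, by lemma B.8 we have $S{\uparrow} \cap T^*{\downarrow} =
\mathrm{sep}(L, K)$.

Starting with $v \in K \cap N$ we must show $v \in \mathrm{proj}_K L$.
Set $u = \mathrm{proj}_L v$, $w = \mathrm{proj}_K u$, and $m = \mathrm{med}(u, v, w)$.
Then $m \in K$ since $v, w \in K$. Since $K \cap L = \emptyset$, we have
$\Delta(v, u) > 0$ and $\Delta(u, w) > 0$. Consider the point $m$: we have
$m \in I(u, w)$ and $m \in K$; by the choice of $w$, $m$ must equal
$w$ and therefore $w \in I(u, v)$. Thus, $w = \mathrm{proj}_K u \in I(u, v)$
and $u = \mathrm{proj}_L w$. By proposition B.7 the pair $u, w$ is a gate
for $K, L$ and we have

$u \setminus w = \mathrm{sep}(L, K) = S{\uparrow} \cap T^*{\downarrow}$.

Consider an element $a \in v \setminus u$. If $\mathfrak{h}(a) \cap L \neq \emptyset$, pick $u' \in
\mathfrak{h}(a) \cap L$. Then $m = \mathrm{med}(u, v, u')$ will satisfy $m \in \mathfrak{h}(a) \cap L$
as well as

$\Delta(v, L) = \Delta(v, u) = \Delta(v, m) + \Delta(m, u)$.

Now, $\Delta(u, m) > 0$ since $u \in \mathfrak{h}(a^*)$ and a contradiction to
$u = \mathrm{proj}_L v$ is obtained. Thus, $\mathfrak{h}(a) \cap L$ must be empty, which
means $L \subseteq \mathfrak{h}(a^*)$. Applying lemma B.8 we obtain $a^* \in S{\uparrow}$.
Overall, we have shown that $v \setminus u \subseteq S{\uparrow}^*$. We will now
verify that $v \setminus w = \emptyset$, finishing the proof. Indeed, were it
not so, there would have been $h \in v \setminus w$. On one hand,
$w \in I(u, v)$ implies $v \setminus w \subseteq v \setminus u$, and hence $h^* \in S{\uparrow}$.
On the other hand, $h \notin w$ means $h^* \in w$ and therefore $h^* \notin
\mathrm{sep}(L, K) = S{\uparrow} \cap T{\uparrow}^*$, which forces $h^* \in R$. Since $R \subseteq v$
(by choice of $v$), we have $h^* \in v$, contradicting our choice of
$h$. ■

We will need the following technical corollary for the
purposes of propagation:

Corollary B.16. Let $S, T \subseteq P$ be subsets and suppose $S$ is
coherent. Let $L = \mathfrak{h}(S)$ and $K = \mathfrak{h}(\mathrm{coh}(T))$. Then:

$\mathrm{proj}_K L = (S{\uparrow} \cup T{\uparrow}) \setminus T{\uparrow}^* = (S{\uparrow} \setminus T{\uparrow}^*) \cup \mathrm{coh}(T)$ (110)

Proof: Recall that $\mathrm{coh}(T) = T{\uparrow} \setminus T{\uparrow}^*$, and set $J = T{\uparrow}
\cap T{\uparrow}^*$, so that $T{\uparrow} = \mathrm{coh}(T) + J$ and $T{\uparrow}^* = \mathrm{coh}(T)^* + J$.
Then,

$(S{\uparrow} \cup T{\uparrow}) \setminus T{\uparrow}^* = ((S{\uparrow} \cup \mathrm{coh}(T) \cup J) \setminus \mathrm{coh}(T)^*) \setminus J = (S{\uparrow} \cup \mathrm{coh}(T)) \setminus \mathrm{coh}(T)^*$

Since $\mathrm{coh}(T){\uparrow} = \mathrm{coh}(T)$, the last expression equals $\mathrm{proj}_K L$,
by the preceding proposition. The proof of the second equality
is similar. ■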
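
The formula of proposition B.15 can be compared against a brute-force nearest-point computation on a small dual graph. The Python sketch below is a hedged illustration on the made-up poc set generated by a < c and b < c (trivial pair omitted): vertices are coherent complete *-selections, the distance between two vertices is the number of literals in which they differ, and all function names are hypothetical.

```python
from itertools import product

def star(s):
    """Complement of a literal: 'a' <-> 'a*'."""
    return s[:-1] if s.endswith("*") else s + "*"

def strict_order(relations):
    """Transitive closure of the given relations plus b* < a* for every a < b."""
    lt = set(relations) | {(star(b), star(a)) for a, b in relations}
    while True:
        new = {(a, d) for a, b in lt for c, d in lt if b == c} - lt
        if not new:
            return lt
        lt |= new

def up(A, lt):
    """Upward closure of A (lt is assumed transitively closed)."""
    return set(A) | {q for p, q in lt if p in A}

def dual_vertices(gens, lt):
    """Coherent complete *-selections."""
    verts = []
    for choice in product(*[(g, star(g)) for g in gens]):
        u = frozenset(choice)
        if not any((s, star(t)) in lt for s in u for t in u):
            verts.append(u)
    return verts

def halfspace(T, verts):
    """All vertices containing every element of T."""
    return [v for v in verts if set(T) <= v]

def brute_force_projection(L, K):
    """Image of L under the closest point projection onto K."""
    return {min(K, key=lambda v: len(u - v)) for u in L}

def formula(S, T, verts, lt):
    """Right-hand side of (109): vertices containing (S-up | T-up) minus T-up*."""
    S_up, T_up = up(S, lt), up(T, lt)
    keep = (S_up | T_up) - {star(x) for x in T_up}
    return set(halfspace(keep, verts))

lt = strict_order([("a", "c"), ("b", "c")])
verts = dual_vertices("abc", lt)
cases = [({"a"}, {"b"}), ({"c*"}, {"a"}), ({"c"}, {"a*"})]
agree = [brute_force_projection(halfspace(S, verts), halfspace(T, verts))
         == formula(S, T, verts, lt) for S, T in cases]
print(agree)
```

The three cases cover intersecting and disjoint pairs of convex sets; extending `cases` to other coherent S and T is a cheap sanity check before relying on the closed formula in propagation.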








